Abstract

We study the $q$-state ferromagnetic Potts model on the $n$-vertex complete graph known
as the mean-field (Curie-Weiss) model. We analyze the Swendsen-Wang algorithm, which is a
Markov chain that utilizes the random cluster representation for the ferromagnetic Potts model
to recolor large sets of vertices in one step and potentially overcomes obstacles that inhibit
single-site Glauber dynamics. The case $q = 2$ (the Swendsen-Wang algorithm for the ferromagnetic
Ising model) undergoes a slow-down at the uniqueness/non-uniqueness critical temperature for
the infinite $\Delta$-regular tree (Long et al., 2014) yet still has polynomial mixing time at all
(inverse) temperatures $\beta > 0$ (Cooper et al., 2000). In contrast, for $q \ge 3$ there are two critical
temperatures $0 < \beta_u < \beta_{rc}$ that are relevant; these two critical points relate to phase transitions
in the infinite tree. We prove that the mixing time of the Swendsen-Wang algorithm for the
ferromagnetic Potts model on the $n$-vertex complete graph satisfies: (i) $\Theta(1)$ for $\beta < \beta_u$, (ii)
$\Theta(n^{1/3})$ for $\beta = \beta_u$, (iii) $\exp(n^{\Omega(1)})$ for $\beta_u < \beta < \beta_{rc}$, and (iv) $\Theta(\log n)$ for $\beta \ge \beta_{rc}$. These
results complement refined results of Cuff et al. (2012) on the mixing time of the Glauber
dynamics for the ferromagnetic Potts model. The most interesting aspect of our analysis is
at the critical temperature $\beta = \beta_u$, which requires a delicate choice of a potential function to
balance the conflating factors for the slow drift away from a fixed point (which is repulsive
but not Jacobian repulsive): close to the fixed point the variance from the percolation step
dominates and sufficiently far from the fixed point the dynamics of the size of the dominant
color class takes over.
1 Introduction


The mixing time of Markov chains is of critical importance for simulations of statistical physics 
models. It is especially interesting to understand how phase transitions in these models manifest 
in the behavior of the mixing time; these connections are the topic of this paper. 

We study the $q$-state ferromagnetic Potts model. In the following definition the case $q = 2$
corresponds to the Ising model and $q \ge 3$ is the Potts model. For a graph $G = (V, E)$ the
configurations of the model are assignments $\sigma : V \to [q]$ of spins to vertices, and let $\Omega$ denote the
set of all configurations. The model is parameterized by $\beta > 0$, known as the (inverse) temperature.
For a configuration $\sigma \in \Omega$ let $m(\sigma)$ be the number of edges in $E$ that are monochromatic under $\sigma$
and let its weight be $w(\sigma) = \exp(\beta m(\sigma))$. Then the Gibbs distribution $\mu$ is defined as follows: for
$\sigma \in \Omega$, $\mu(\sigma) = w(\sigma)/Z(\beta)$, where $Z(\beta) = \sum_{\sigma \in \Omega} w(\sigma)$ is the normalizing constant, known as the
partition function.

A useful feature for studying the ferromagnetic Potts model is its alternative formulation known
as the random-cluster model. Here configurations are subsets of edges and the weight of such a
configuration $S \subseteq E$ is

$$w_{rc}(S) = p^{|S|} (1-p)^{|E \setminus S|} q^{k(S)},$$

where $p = 1 - \exp(-\beta)$ and $k(S)$ is the number of connected components in the graph $G' = (V, S)$
(isolated vertices do count). The corresponding partition function $Z_{rc} = \sum_{S \subseteq E} w_{rc}(S)$ satisfies

$$Z_{rc} = (1-p)^{|E|} Z.$$

The focus of this paper is the mean-field (Curie-Weiss) model, which in computer science
terminology is the $n$-vertex complete graph $G = (V, E)$. The interest in this model is that it allows
more detailed results and these results are believed to extend to other graphs of particular interest
such as random regular graphs. For convenience we parameterize the model in terms of a constant
$B > 0$ such that the Gibbs distribution is as follows:

$$\mu(\sigma) = \frac{(1 - B/n)^{-m(\sigma)}}{Z(\beta)}. \tag{1}$$

(Note that $\beta = -\ln(1 - B/n) \sim B/n$ for large $n$.) The following critical points $\mathfrak{B}_u < \mathfrak{B}_o < \mathfrak{B}_{rc}$ for
the parameter $B$ are well-studied¹ and relevant to our study of the Potts model on the complete
graph:

$$\mathfrak{B}_u = \sup\left\{ B \ge 0 \;:\; \frac{B - z}{B + (q-1)z} \ne e^{-z} \text{ for all } z > 0 \right\} = \min_{z > 0} \left\{ z + \frac{qz}{e^{z} - 1} \right\}, \tag{2}$$

$$\mathfrak{B}_o = \frac{2(q-1)\ln(q-1)}{q-2}, \qquad \mathfrak{B}_{rc} = q. \tag{3}$$
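The three thresholds can be evaluated numerically from the formulas $\min_{z>0}\{z + qz/(e^z - 1)\}$, $2(q-1)\ln(q-1)/(q-2)$ and $q$. The following Python sketch does this for an integer $q \ge 3$. The names `f` and `thresholds` are illustrative and do not come from the paper, and the ternary search assumes that $z \mapsto z + qz/(e^z - 1)$ is unimodal on $(0, q]$; the code does not verify this assumption.

```python
import math

def f(z, q):
    # f(z) = z + q*z/(e^z - 1); its minimum over z > 0 is the threshold B_u.
    return z + q * z / math.expm1(z)

def thresholds(q, iters=200):
    """Return (B_u, B_o, B_rc) for an integer q >= 3 (illustrative helper)."""
    # Since f(z) >= z and f(z) -> q as z -> 0, the minimizer lies in (0, q].
    lo, hi = 1e-12, float(q)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        # Ternary search step; assumes f(., q) is unimodal on (0, q].
        if f(m1, q) < f(m2, q):
            hi = m2
        else:
            lo = m1
    b_u = f(0.5 * (lo + hi), q)
    b_o = 2.0 * (q - 1) * math.log(q - 1) / (q - 2)
    b_rc = float(q)
    return b_u, b_o, b_rc

if __name__ == "__main__":
    print(thresholds(3))
```

For $q = 3$ the three values come out close to 2.746, 2.773 and 3, in the stated order.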


These thresholds correspond to the critical points for the infinite $\Delta$-regular tree $\mathbb{T}_\Delta$ and random
$\Delta$-regular graphs by taking appropriate limits as $\Delta \to \infty$. (More specifically, if $B(\Delta)$ is a threshold on
$\mathbb{T}_\Delta$ or the random $\Delta$-regular graph then $\lim_{\Delta \to \infty} \Delta(B(\Delta) - 1)$ is the corresponding threshold in the
Curie-Weiss model.) In this perspective, $\mathfrak{B}_u$ corresponds to the uniqueness/non-uniqueness threshold
on $\mathbb{T}_\Delta$; $\mathfrak{B}_o$ corresponds to the ordered/disordered phase transition; and $\mathfrak{B}_{rc}$ was conjectured by
Häggström to correspond to a second uniqueness/non-uniqueness threshold for the random-cluster
model on $\mathbb{T}_\Delta$ with periodic boundaries (in particular, he conjectured that non-uniqueness holds iff
$B \in (\mathfrak{B}_u, \mathfrak{B}_{rc})$). For a detailed exposition of these critical points we refer the reader to [11] (see
also [12] for their relevance for random regular graphs). We should finally remark that in the case
of the Ising model ($q = 2$), the three points $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$ coincide.

¹ $\mathfrak{B}_o$ is $\beta_c$ in [10, Equation (3.1)] and $\mathfrak{B}_u$ is equivalent to $\beta_s$ in [11, Equation (1.1)] under the parametrization
$z = B(qx - 1)/(q - 1)$. We follow the convention of counting monochromatic edges [10] as opposed to counting
monochromatic pairs of vertices [11]; hence our thresholds are larger than those in [11] by a factor of 2.

The Glauber dynamics is a classical tool for studying the Gibbs distribution. It is the
class of Markov chains whose transitions update the configuration at a randomly chosen vertex and
which are designed so that their stationary distribution is the Gibbs distribution. The limitation of local
Markov chains, such as the Glauber dynamics, is that they are typically slow to converge at low
temperatures (large $B$). The Swendsen-Wang algorithm is a more sophisticated Markov chain that
utilizes the random cluster representation of the Potts model to potentially overcome bottlenecks
that obstruct the simpler Glauber dynamics. It is formally defined as follows.

The Swendsen-Wang algorithm is a Markov chain $(X_t)$ whose transitions $X_t \to X_{t+1}$ are as
follows. From a configuration $X_t \in \Omega$:

• Let $M$ be the set of monochromatic edges in $X_t$.

• For each edge $e \in M$, delete it with probability $1 - B/n$. Let $M'$ denote the set of
monochromatic edges that were not deleted.

• In the graph $(V, M')$, independently for each connected component choose a color uniformly
at random from $[q]$ and assign all vertices in that component the chosen color. Let $X_{t+1}$
denote the resulting spin configuration.
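The three steps above can be sketched directly in Python for the complete graph. This is a minimal illustration, not the authors' code: the function name `swendsen_wang_step`, the encoding of colors as integers `0..q-1`, and the use of a union-find structure for the connected components are all choices made here. It runs in time quadratic in $n$ and assumes $0 < B \le n$ so that $B/n$ is a probability.

```python
import random

def swendsen_wang_step(colors, q, B, rng):
    """One Swendsen-Wang transition on the complete graph K_n.

    colors: list of spins in range(q); B: parameter with 0 < B <= n;
    rng: a random.Random instance. Returns the new list of spins.
    """
    n = len(colors)
    keep = B / n  # each monochromatic edge is deleted with probability 1 - B/n
    parent = list(range(n))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Percolation step: keep each monochromatic edge independently.
    for u in range(n):
        for v in range(u + 1, n):
            if colors[u] == colors[v] and rng.random() < keep:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    # Recoloring step: one uniform color per connected component.
    new_color = {}
    result = []
    for v in range(n):
        r = find(v)
        if r not in new_color:
            new_color[r] = rng.randrange(q)
        result.append(new_color[r])
    return result

if __name__ == "__main__":
    rng = random.Random(1)
    state = [rng.randrange(3) for _ in range(30)]
    for _ in range(5):
        state = swendsen_wang_step(state, 3, 3.5, rng)
    print(state)
```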

Recall, the mixing time $T_{mix}$ of an ergodic Markov chain is defined as the number of steps
from the worst initial state to get within total variation distance $\le 1/4$ of its unique stationary
distribution. For the Swendsen-Wang algorithm for the ferromagnetic Ising model on the complete
graph, Cooper et al. [8] showed that $T_{mix} = O(\sqrt{n})$ for all temperatures. Long et al. [17] showed
more refined results establishing that the mixing time is $\Theta(1)$ for $\beta < \beta_c$, $\Theta(n^{1/4})$ for $\beta = \beta_c$,
and $\Theta(\log n)$ for $\beta > \beta_c$ where $\beta_c$ is the uniqueness/non-uniqueness threshold. For square boxes of
$\mathbb{Z}^2$, Ullrich [25] proved that the mixing time of Swendsen-Wang is polynomial for all temperatures
(building upon results for the Glauber dynamics by Martinelli and Olivieri [19, 20] and Lubetzky
and Sly [18]).

For the Swendsen-Wang algorithm for the ferromagnetic Potts model, it was shown that the
mixing time is exponentially large in $n = |V|$ at the critical point $B = \mathfrak{B}_o$ by Gore and Jerrum [13]
for the complete graph, Cooper and Frieze [9] for $G(n,p)$ for $p = \Omega(n^{-1/3})$, Galanis et al. [12] for
random regular graphs, and Borgs et al. [5, 6] for the $d$-dimensional integer lattice for $q \ge 25$ at
the analogous critical point. For $\mathbb{Z}^2$, Ullrich [25] proves polynomial mixing time at all temperatures
except criticality, building upon the results of Beffara and Duminil-Copin [2]. For the Glauber
dynamics for the ferromagnetic Potts model on the complete graph, Cuff et al. [11] showed that
the mixing time satisfies (their results are significantly more precise than what we state here for
convenience): $\Theta(n \log n)$ for $B < \mathfrak{B}_u$, exponentially slow mixing for $B > \mathfrak{B}_u$, and $\Theta(n^{4/3})$ mixing
time for $B = \mathfrak{B}_u$ (and in a scaling window around $\mathfrak{B}_u$).

We can now state our main result, which is a complete classification of the mixing time of the
Swendsen-Wang dynamics on the complete graph when the parameter $B$ is a constant independent
of $n$.

Theorem 1. For all integer $q \ge 3$, the mixing time $T_{mix}$ of the Swendsen-Wang algorithm on the
$n$-vertex complete graph satisfies:

1. For all $B < \mathfrak{B}_u$, $T_{mix} = \Theta(1)$.

2. For $B = \mathfrak{B}_u$, $T_{mix} = \Theta(n^{1/3})$.

3. For all $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$, $T_{mix} = \exp(n^{\Omega(1)})$.

4. For all $B \ge \mathfrak{B}_{rc}$, $T_{mix} = \Theta(\log n)$.

In an independent work, Blanca and Sinclair [3] analyze a closely related chain to the
Swendsen-Wang dynamics which is also suitable for sampling random cluster configurations (it works more
generally for $q \ge 1$ with $q \in \mathbb{R}$). They provide an analogue of Theorem 1, though their analysis
excludes the critical points $B = \mathfrak{B}_u$ and $B = \mathfrak{B}_{rc}$.

In the following section, we discuss the critical points $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$, present a function $F$ which
captures a simplified view of the Swendsen-Wang dynamics, and then we present a lemma
connecting the behavior of $F$ with the critical points. We also present in Section 2 a high-level sketch of
the proof of Theorem 1. In Section 5 we prove the slow mixing result (Part 3 of Theorem 1). We
then prove the rapid mixing results for $B > \mathfrak{B}_{rc}$ in Section 7 and for $B = \mathfrak{B}_u$ in Section 11. The
case $B = \mathfrak{B}_{rc}$ is completed in Section 8 and $B < \mathfrak{B}_u$ is done in Section 10.

2 Proof Approach 

2.1 Critical Points for Phase Transitions 

We review the thresholds $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$ for the mean-field Potts model; the reader is referred to [4]
for further details, which also apply to the random-cluster model. The thresholds $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$ are
related to the critical points of the following function of the partition function. We first need to
introduce some notation. For a configuration $\sigma : V \to [q]$ and a color $i \in [q]$, let $\alpha_i(\sigma)$ be the
fraction of vertices with color $i$ in $\sigma$, i.e., $\alpha_i(\sigma) = |\{v \in V : \sigma(v) = i\}|/n$. We also denote by $\alpha(\sigma)$
the vector $(\alpha_1(\sigma), \ldots, \alpha_q(\sigma))$, and refer to it as the phase of $\sigma$.

For a $q$-dimensional probability vector $\alpha$, let $\Omega^{\alpha}$ be the set of configurations $\sigma$ whose phase is
$\alpha$. Let

$$Z^{\alpha} = \sum_{\sigma \in \Omega^{\alpha}} w(\sigma) \quad \text{and} \quad \Psi(\alpha) := \lim_{n \to \infty} \frac{1}{n} \ln Z^{\alpha}.$$

There are two relevant phases: the uniform phase $u := (1/q, \ldots, 1/q)$ and the majority phase
$m := (a, b, \ldots, b)$ and its $q$ permutations. For the majority phase, $a, b$ are such that $a + (q-1)b = 1$
and $a > 1/q$ is a critical point² of

$$\Psi_1(a) := \Psi(a, b, \ldots, b) = -a \ln a - (1-a) \ln \frac{1-a}{q-1} + \frac{B}{2}\left(a^2 + \frac{(1-a)^2}{q-1}\right), \qquad b = \frac{1-a}{q-1}, \tag{4}$$

and hence satisfies

$$\ln \frac{(q-1)a}{1-a} = B\left(a - \frac{1-a}{q-1}\right). \tag{5}$$

The thresholds $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$ relate to the critical points of $\Psi$; see Figure 1 for an illustration of the
following. For $B < \mathfrak{B}_u$ the uniform phase is the unique local maximum of $\Psi$. For $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$
there are $q + 1$ local maxima: the uniform phase and the $q$ majority phases, and at $B = \mathfrak{B}_o$ they
are all global maxima. Finally, for $B \ge \mathfrak{B}_{rc}$ the $q$ majority phases are the only local maxima.

² In the regime $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$ there are two critical points of $\Psi_1(a)$ with value $a > 1/q$. The value of the majority
phase is then given by the point where $\Psi_1(a)$ has a local maximum. In fact, except for $B = \mathfrak{B}_u$, for every $B > \mathfrak{B}_u$,
$\Psi_1(a)$ has a local maximum at the majority phase. See also Figure 1.
[Figure 1, surviving panel titles: (d) $B = \mathfrak{B}_o$. (e) $\mathfrak{B}_o < B < \mathfrak{B}_{rc}$. (f) $B \ge \mathfrak{B}_{rc}$.]

Figure 1: The function $\Psi_1$ (free energy) plotted in different regimes of $B$ (defined in (4)). The
critical points $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$ are given by (2) and (3). In the regime $B < \mathfrak{B}_u$ (figure 1a), the function
$\Psi_1$ has a unique local maximum at the disordered phase. At $B = \mathfrak{B}_u$ (figure 1b), the function
$\Psi_1$ has a saddle point at the ordered phase. In the regime $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$ (figures 1c, 1d and 1e)
the function has two local maxima; these are both global maxima iff $B = \mathfrak{B}_o$. In the regime
$B \ge \mathfrak{B}_{rc}$ (figure 1f), the function $\Psi_1$ has a unique local maximum at the ordered phase and a saddle
point at the disordered phase.


2.2 Connections to Simplified Swendsen-Wang 

The following function³ from $[1/q, 1]$ to $[0,1]$ will capture the behavior of the Swendsen-Wang
algorithm.

$$F(z) := \frac{1}{q} + \left(1 - \frac{1}{q}\right) z x, \tag{6}$$

where $x = 0$ for $z \le 1/B$ and for $z > 1/B$, $x \in (0,1]$ is the (unique) solution of

$$x + \exp(-zBx) = 1. \tag{7}$$

The function $F$ captures the size of the largest color class when there is a single heavy color, where
heavy means that the color class is supercritical in the percolation step of the Swendsen-Wang
process. Hence after the percolation step this heavy color will have a giant component and the
other color classes will all be broken into small components. So say initially the one heavy color has
size $zn$ for $1/B < z \le 1$ and let us consider its size after one step of the Swendsen-Wang dynamics.
After the percolation step, this heavy color will have a giant component of size roughly $xzn$ (where
$x$ is as in (7)) and all other components will be of size $O(\log n)$. Then a $1/q$ fraction of the small
components will be recolored the same as the giant component, and hence the size of the largest
color class will be (roughly) $nF(z)$ after this one step of the Swendsen-Wang dynamics.

³ The argument of $F$ will typically be the density of the largest color class (we could have extended the domain of
the function $F$ to be the interval $[0,1]$ by further defining the value of $F$ in the interval $[0,1/B)$ to be $1/q$).
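The map $F(z) = 1/q + (1 - 1/q)zx$, with $x$ the positive root of $x + \exp(-zBx) = 1$ when $zB > 1$ and $x = 0$ otherwise, can be evaluated numerically. The sketch below is illustrative only: the names `giant_fraction` and `F` are chosen here, and the root is located by bisection on the bracket $[\ln(c)/c, 1]$ with $c = zB$, which works because $g(x) = x + e^{-cx} - 1$ is convex with its minimum at $\ln(c)/c$.

```python
import math

def giant_fraction(c, iters=200):
    """Positive root x of x + exp(-c*x) = 1 for c > 1, and 0.0 for c <= 1."""
    if c <= 1.0:
        return 0.0
    # g(x) = x + exp(-c x) - 1 is convex, g(0) = 0, minimal at ln(c)/c, g(1) > 0.
    lo, hi = math.log(c) / c, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + math.exp(-c * mid) - 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(z, q, B):
    """Approximate density of the largest color class after one step."""
    x = giant_fraction(z * B)
    return 1.0 / q + (1.0 - 1.0 / q) * z * x

if __name__ == "__main__":
    q, B = 3, 3.5
    for z in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(z, F(z, q, B) - z)  # drift F(z) - z
```

Printing the drift $F(z) - z$ over a range of $z$ gives a quick numerical picture of where the fixpoints of $F$ lie for a given $B$.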
Our next goal is to tie together the functions $F$ and $\Psi_1$ so that we can relate the behavior of
the Swendsen-Wang dynamics with the underlying phase transitions of the model. We first need
some terminology. A critical point $a$ of a function $f : \mathbb{R} \to \mathbb{R}$ is a hessian maximum if the second
derivative of $f$ at $a$ is negative (this is a sufficient condition for $a$ to be a local maximum). A
fixpoint $a$ of a function $F : \mathbb{R} \to \mathbb{R}$ is a jacobian attractive fixpoint if $|F'(a)| < 1$ (this is a sufficient
condition for $a$ to be an attractive fixpoint).

Lemma 2. The hessian maxima of $\Psi_1$ correspond to jacobian attractive fixpoints of $F$.

Note, the critical points of $\Psi_1$ correspond to fixpoints of $F$; the only exception is when $B > \mathfrak{B}_{rc}$
and the "uniform" point $a = 1/q$ ($\Psi_1$ has a saddle point at $1/q$, but $F$ does not have a fixpoint at
$1/q$ since $1/q > 1/B$). Lemma 2 is proved in Section 3.1.

The behavior of $F$ is the basic tool for proving Theorem 1. Recall the earlier discussion of the
uniform vector $u := (1/q, \ldots, 1/q)$ and the $q$ permutations of the majority phase $m := (a, b, \ldots, b)$.
The following lemma (proved in Section 3.2) provides some basic intuition about the proof of
Theorem 1; see Figure 2 for a depiction of the various regimes.

Lemma 3. Let $q \ge 3$. For the function $F$,

1. For $B < \mathfrak{B}_u$, $u = 1/q$ is the unique fixpoint and it is jacobian attractive.

2. For $B = \mathfrak{B}_u$, there are 2 fixpoints: $u$ and $a$, where $a$ is defined as in the majority phase $m$. Of
these, only $u$ is (jacobian) attractive. The fixpoint $a$ is repulsive but not jacobian repulsive.

3. For $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$, there are 2 attractive fixpoints: $u$ and $a$, where $a$ is defined as in the
majority phase $m$. Both of these are jacobian attractive.

4. For $B = \mathfrak{B}_{rc}$, both $a$ and $u$ are fixpoints. The fixpoint $u$ is (jacobian) repulsive, while the
fixpoint $a$ is jacobian attractive.

5. For $B > \mathfrak{B}_{rc}$, $a$ is the only fixpoint and it is jacobian attractive.

The reason that $u$ abruptly changes from a jacobian attractive fixpoint ($B < \mathfrak{B}_{rc}$) to a jacobian
repulsive fixpoint ($B = \mathfrak{B}_{rc}$) stems from the fact that in the regime $B < \mathfrak{B}_{rc}$, $F$ is constant in a
small neighborhood around $1/q$ (precisely, in the interval $[1/q, 1/B]$), which is no longer the case
for $B = \mathfrak{B}_{rc}$.

2.3 Proof Sketches 

We explain the high-level proof approach for the various parts of Theorem 1 before presenting the 
detailed proofs in subsequent sections. 

Slow mixing: For Part 3 of Theorem 1, the main idea is that the function $F$ has 2 attractive
fixpoints (see Lemma 3). At least one of the corresponding phases, $u$ or $m$, is a global maximum for
$\Psi$. Consider the other phase, say it is $u$ for concreteness. Consider the local ball around $u$; these
are configurations that are close in distance to $u$. The key is that since $u$ is an attractive
fixpoint for $F$, if the initial state is in this local ball then with very high probability after one step
of the Swendsen-Wang dynamics it will still be in the local ball (see Lemma 18, and Lemma 19 for
the analogous lemma for $m$). The result then follows since one needs to sample from the local ball
around the phase which corresponds to the global maximum of $\Psi$ to get close to the stationary
distribution. The full argument is given in Section 5.

Fast mixing for $B > \mathfrak{B}_{rc}$: For a configuration $\sigma$ and spin $i$, say the color class is heavy if
the number of vertices with spin $i$ is $> n/B$ and light if it is $< n/B$. If a color class is heavy
then it is super-critical for the percolation step of Swendsen-Wang and hence there will be a giant
component. The key is that for any initial state $X_0$, with constant probability the largest
components from all of the colors will choose the same new color and consequently there will be
only one heavy color class and the other $q - 1$ colors will be light. Hence we can assume there is
one heavy color class and $q - 1$ light color classes, and then the function $F$ suitably describes the
size of the largest color class during the evolution of the Swendsen-Wang dynamics. Since the only
local maximum of $\Psi$ corresponds to the majority phase $m$, after $O(\log n)$ steps we will be close to
$m$; the difference will be due to the stochastic nature of the process. Then it is straightforward
to define a coupling for two chains $(X_t, Y_t)$ whose initial states $X_0, Y_0$ are close to $m$ so that after
$T = O(\log n)$ steps we have that $X_T = Y_T$. The proof of the upper bound on the mixing time is
given in Section 7; the lower bound on the mixing time is proved in Section 9.

[Figure 2, surviving panel titles: (a) $B < \mathfrak{B}_u$. (b) $B = \mathfrak{B}_u$. (c) $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$. Vertical axis in each panel: $F(z) - z$.]

Figure 2: The drift function $F(z) - z$, where $F$ is defined by (6), (7). The critical points $\mathfrak{B}_u$, $\mathfrak{B}_o$, $\mathfrak{B}_{rc}$
are given by (2) and (3). In the regime $B < \mathfrak{B}_u$ (figure 2a), the function $F$ has a unique attractive
fixpoint at the disordered phase. At $B = \mathfrak{B}_u$ (figure 2b), $F$ also has a (non-jacobian) repulsive
fixpoint at the ordered phase. In the regime $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$ (figure 2c), $F$ has attractive fixpoints
at the ordered and disordered phases. At $B = \mathfrak{B}_{rc}$ (figure 2d), the disordered phase is no longer
attractive; it is Jacobian repulsive. Finally, in the regime $B > \mathfrak{B}_{rc}$ (figure 2e), the function $F$ has
a unique attractive fixpoint at the ordered phase.

Fast mixing for $B = \mathfrak{B}_{rc}$: The basic outline is similar to the $B > \mathfrak{B}_{rc}$ case except here the
argument is more intricate when the heaviest color lies in the scaling window (for the onset of a
giant component). We need a more involved argument to show that we get away from initial configurations
that are close to the uniform phase; informally, the uniform fixpoint is Jacobian repulsive, so an
initial displacement increases geometrically by a constant factor. The proof of the upper bound on
the mixing time is given in Section 8; the lower bound on the mixing time is proved in Section 9.

Fast mixing for $B < \mathfrak{B}_u$: Here the argument is similar to the $B > \mathfrak{B}_{rc}$ case; in fact it is
easier. The critical point for a giant component in the percolation step is density $1/B$. In this case
we have that $B < \mathfrak{B}_u$ and since $\mathfrak{B}_u < \mathfrak{B}_{rc} = q$ we have that in the uniform phase (which is the only
local maximum) the color classes are all subcritical. Hence once we are close to the uniform phase
all of the components after the percolation step will be of size $O(\log n)$. So the basic argument is
similar to the $B > \mathfrak{B}_{rc}$ case in how we approach the local maximum, which is the uniform phase
in this case. Then once we reach density $< 1/B$, in the next step the configuration will be
close to the uniform phase and then, using a symmetry argument of [17], we can couple two such
configurations in one more step. Note that the difference with the regime $B > \mathfrak{B}_{rc}$ is that we get
close to the uniform phase abruptly; this is the reason that the mixing time for $B < \mathfrak{B}_u$ is $\Theta(1)$.
The details can be found in Section 10.

Fast mixing for $B = \mathfrak{B}_u$: This is the most difficult part. As in the $B > \mathfrak{B}_{rc}$ case, with
constant probability there will be at most one heavy color class after one step. We then track the
evolution of the size of the heavy color class. The difficulty arises because the size of the component
does not decrease in expectation at the majority fixpoint. However, variance moves the size of the
component into a region where the size of the component decreases in expectation. The formal
argument uses a carefully engineered potential function that decreases because of the variance (the
function is concave around the fixpoint) and expectation (the function is increasing) of the size of
the largest color class; see Section 11.


3 Phases of the Gibbs distribution and stability analysis of fixpoints of F

3.1 Connection: Proof of Lemma 2 


In this section we prove Lemma 2 presented in Section 2 connecting the critical points of the
function $\Psi_1(a)$ with the fixpoints of the function $F$, which captures the density of the largest color
class after one step of the SW-algorithm.

We first prove the following lemma. The lemma corresponds to the intuitive fact that $F(z)$
should be an increasing function of the initial density $z$ and that the rate of increase, i.e., $F'(z)$,
should be a decreasing function of $z$.

Lemma 4. For every $B > 0$, the function $F$ satisfies $F'(z) > 0$ and $F''(z) < 0$ for all $z \in (1/B, 1]$,
i.e., $F$ is strictly increasing and concave in the interval $[1/B, 1]$.

Proof. We may assume that $B > 1$ (otherwise there is nothing to prove). Let $z \in (1/B, 1]$ and
recall that $x \in (0,1)$ is the (unique) solution of

$$x + \exp(-zBx) = 1. \tag{7}$$

We view (7) as an equation that defines $x$ as an implicit function of $z$. Differentiating (7) two times
we obtain

$$\frac{dx}{dz} = \frac{Bxe^{-zBx}}{1 - zBe^{-zBx}}, \qquad \frac{d^2x}{dz^2} = \frac{B^2 x e^{-zBx}\left(2e^{-zBx}(1 - zBe^{-zBx}) - x\right)}{(1 - zBe^{-zBx})^3}.$$

These yield

$$F'(z) = \frac{q-1}{q}\left(x + z\frac{dx}{dz}\right) = \frac{(q-1)x}{q(1 - zBe^{-zBx})},$$

$$F''(z) = \frac{q-1}{q}\left(2\frac{dx}{dz} + z\frac{d^2x}{dz^2}\right) = -\frac{(q-1)Bxe^{-zBx}\left(zB(x + 2e^{-zBx}) - 2\right)}{q(1 - zBe^{-zBx})^3}. \tag{8}$$
We first show that $F'(z) > 0$ for all $z \in (1/B, 1]$. Since $x$ is positive for all $z > 1/B$, it suffices
to show that $1 - zBe^{-zBx} > 0$. Since $x$ satisfies (7), we have $e^{-zBx} = 1 - x$ and $zB = -\ln(1-x)/x$, so we have

$$1 - zBe^{-zBx} = \frac{x + (1-x)\ln(1-x)}{x} > 0 \tag{9}$$

for all $0 < x < 1$ (the inequality holds since the derivative of the numerator is $-\ln(1-x) > 0$ and its
value at $x = 0$ is 0). Thus $F'(z) > 0$ for $z > 1/B$.

We next show that $F''(z) < 0$ for all $z \in (1/B, 1]$. We have already shown that the denominator
in the expression for $F''(z)$ is positive, so we only need to show that $zB(x + 2e^{-zBx}) - 2 > 0$. Using
again that $zB = -\ln(1-x)/x$, we have

$$zB(x + 2e^{-zBx}) - 2 = -\frac{2x + (2-x)\ln(1-x)}{x} > 0$$

for all $0 < x < 1$ (the inequality holds since the numerator $2x + (2-x)\ln(1-x)$ at $x = 0$ is 0 and its first
derivative, $-\ln(1-x) - x/(1-x)$, is negative on $(0,1)$).

This concludes the proof. □
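The two sign conditions used in the proof of Lemma 4 depend on $z$ and $B$ only through $x$, because (7) gives $e^{-zBx} = 1 - x$ and $zB = -\ln(1-x)/x$. The following Python sketch evaluates both quantities on a grid of $x \in (0,1)$ as a numerical sanity check; it is not a substitute for the proof, and the function name `lemma4_quantities` is made up for this illustration.

```python
import math

def lemma4_quantities(x):
    """For 0 < x < 1 return (d, s), where with c = zB = -ln(1-x)/x:
       d = 1 - c*exp(-c*x)             (denominator factor in F' and F'')
       s = c*(x + 2*exp(-c*x)) - 2     (numerator factor in F'')."""
    c = -math.log1p(-x) / x   # c = zB, from equation (7)
    e = 1.0 - x               # exp(-c*x) = 1 - x, again from (7)
    d = 1.0 - c * e
    s = c * (x + 2.0 * e) - 2.0
    return d, s

if __name__ == "__main__":
    for k in (1, 25, 50, 75, 99):
        x = k / 100.0
        print(x, lemma4_quantities(x))
```

Both returned values are positive on the grid, matching $F' > 0$ and $F'' < 0$; near $x = 0$ they behave like $x/2$ and $x^2/6$ respectively, so they are small but still positive there.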

To prove Lemma 2, we will use the following bound on the coordinates of a critical point of $\Psi_1$.

Lemma 5. Let $a > 1/q$ be a critical point of $\Psi_1(a)$. Let $b = (1-a)/(q-1)$. Then $aB > 1$ and
$bB < 1$.

Proof. The following parametrization will simplify the expressions: $a = (z+1)/(z+q)$ where $z > 0$.
Equation (5) becomes

$$\ln(1+z) = \frac{zB}{z+q}. \tag{10}$$

For $z > 0$ that satisfies (10) we obtain

$$aB = \frac{B(z+1)}{z+q} = (1 + 1/z)\ln(1+z) > 1,$$

and

$$bB = (1-a)B/(q-1) = \frac{B}{z+q} = \frac{1}{z}\ln(1+z) < 1.$$

□
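The parametrization $a = (z+1)/(z+q)$ also gives a convenient way to locate the critical points of $\Psi_1$ with $a > 1/q$ numerically: by (10) they correspond to solutions $z > 0$ of $(z+q)\ln(1+z)/z = B$. The Python sketch below counts and refines these solutions on a logarithmic grid. The names `h` and `majority_critical_points` and the grid bounds are assumptions made for this illustration, and roots outside the grid range would be missed.

```python
import math

def h(z, q):
    # Equation (10) rearranged: z > 0 gives a critical point iff h(z, q) = B.
    return (z + q) * math.log1p(z) / z

def majority_critical_points(q, B, z_min=1e-6, z_max=1e4, grid=4000):
    """Return the values a = (z+1)/(z+q) > 1/q solving (10), found by
    scanning a geometric grid for sign changes and refining by bisection."""
    zs = [z_min * (z_max / z_min) ** (k / grid) for k in range(grid + 1)]
    roots = []
    for lo, hi in zip(zs, zs[1:]):
        if (h(lo, q) - B) * (h(hi, q) - B) < 0:
            for _ in range(100):
                mid = 0.5 * (lo + hi)
                if (h(lo, q) - B) * (h(mid, q) - B) <= 0:
                    hi = mid
                else:
                    lo = mid
            z = 0.5 * (lo + hi)
            roots.append((z + 1.0) / (z + q))
    return roots

if __name__ == "__main__":
    for B in (2.5, 2.9, 3.5):
        print(B, majority_critical_points(3, B))
```

For $q = 3$ the scan finds no such critical point at $B = 2.5$, two at $B = 2.9$, and one at $B = 3.5$, which is consistent with the three regimes below $\mathfrak{B}_u$, between $\mathfrak{B}_u$ and $\mathfrak{B}_{rc}$, and above $\mathfrak{B}_{rc}$.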


We are now ready to give the proof of Lemma 2. 


Proof of Lemma 2. We first prove the following: when $B \le \mathfrak{B}_{rc}$, the critical points of $\Psi_1$ correspond
to fixpoints of $F$; when $B > \mathfrak{B}_{rc}$, the only critical point of $\Psi_1$ which does not correspond to a
fixpoint of $F$ is $a = 1/q$.

To prove this, let $a$ be a critical point of $\Psi_1$. We use the same parametrization as in the proof
of Lemma 5, i.e., $a = (z+1)/(z+q)$ where $z \ge 0$. Recall that $z$ satisfies

$$\ln(1+z) = \frac{zB}{z+q}. \tag{10}$$

Now, let $a$ be a fixpoint of $F$. Under the parametrization, equation $F(a) = a$ becomes

$$x = \frac{z}{z+1}, \tag{11}$$

and (7) becomes

$$x + \exp\left(-\frac{(z+1)Bx}{z+q}\right) = 1. \tag{12}$$

Note that when $a = 1/q$, we obtain from (11) that $x = 0$. In the regime $B > \mathfrak{B}_{rc}$ we have that
$1/q > 1/B$ and hence the solution $x = 0$ to (7) is not valid, cf. the definition of $F$. Thus, $a = 1/q$
is not a fixpoint of $F$ when $B > \mathfrak{B}_{rc}$.

Plugging (11) into (12) and taking the logarithm of both sides one obtains (10). This proves the
correspondence between critical points of $\Psi_1$ and fixpoints of $F$ stated in the beginning.

Now we are ready to establish the lemma. We split the proof into cases depending on whether
$a > 1/q$. We first do the case where $a > 1/q$. Note, from Lemma 5, we also have $a > 1/B$. From
(8), we have that at a fixpoint $a$ of $F$ it holds that

$$F'(a) = \frac{q-1}{q} \cdot \frac{x}{1 - aB\exp(-aBx)}. \tag{13}$$

Also, at a critical point $a$ of $\Psi_1$, we have by differentiating (4) that

$$\Psi_1''(a) = B\frac{q}{q-1} - \frac{1}{a(1-a)}. \tag{14}$$

At a critical point/fixpoint we have $F(a) = a$ and (7). We first express $\exp(-aBx)$ from (7), plug
it in (13), then express $x$ from $F(a) = a$ (which yields $x = (qa - 1)/((q-1)a)$) and plug it in the
resulting expression. Simplifying the expressions we obtain that at a critical point $a$

$$F'(a) = \frac{\frac{1}{1-a} - \frac{B}{q-1} + \frac{1}{q}\Psi_1''(a)}{\frac{1}{1-a} - \frac{B}{q-1}}. \tag{15}$$

The denominator of (15) is positive (by Lemma 5) and hence at a critical point $a$ we have

$$F'(a) < 1 \iff \Psi_1''(a) < 0. \tag{16}$$

We also have by Lemma 4 (or, alternatively, by the expression in (15), whose numerator equals
$(qa-1)/(qa(1-a))$) that $F'(a) > 0$, so we can rewrite (16) as

$$|F'(a)| < 1 \iff \Psi_1''(a) < 0. \tag{17}$$

This establishes the lemma in the case $a > 1/q$.

We next do the case where $a = 1/q$. From (14), we have that $a$ satisfies $\Psi_1''(a) < 0$ iff $B < \mathfrak{B}_{rc}$.
In the regime $B \le \mathfrak{B}_{rc}$ we have that $F(1/q) = 1/q$ and hence $a = 1/q$ is a fixpoint of $F$. In the
upcoming Lemma 9, it is proved that $a = 1/q$ is a Jacobian attractive fixpoint of $F$ iff $B < \mathfrak{B}_{rc}$.
This establishes the lemma in the case $a = 1/q$.

This concludes the proof of the lemma. □


3.2 Analysis of the fixpoints of F: Proof of Lemma 3

In this section, we prove Lemma 3.

The following result shows that the local maxima of $\Psi_1$ are hessian and hence Lemma 2 can be
used to conclude the existence of attractive fixpoints of $F$.

Lemma 6. A critical point $a > 1/q$ of $\Psi_1$ for $B > \mathfrak{B}_u$ has non-zero second derivative.
Proof. Let $a > 1/q$ be a critical point of $\Psi_1$ such that $\Psi_1''(a) = 0$, that is, (14) is zero. Then we
have $1/q = 1 - Ba(1-a)$. Plugging the value of $q$ into (5) we obtain

$$\ln \frac{Ba^2}{1 - Ba(1-a)} = \frac{Ba - 1}{a}. \tag{18}$$

Let $w = B - 1/a$. Note that by Lemma 5 we have $w > 0$. Equation (18) becomes

$$\ln(1 - w(1 - w/B)) = -w, \tag{19}$$

and hence we can parameterize $B, a, q$ in terms of $w$:

$$B = \frac{w^2}{e^{-w} + w - 1}, \qquad a = \frac{1}{1 - e^{-w}} - \frac{1}{w}, \qquad q = \frac{e^{w} + e^{-w} - 2}{e^{-w} + w - 1}. \tag{20}$$

We will now show $B \le \mathfrak{B}_u$. Suppose not, that is, $\mathfrak{B}_u < B$. Then, looking at the definition (2),
there exists $B' < B$ and $z > 0$ such that

$$\frac{B' - z}{B' + (q-1)z} = \exp(-z)$$

and hence

$$B' = z + \frac{qz}{e^z - 1}. \tag{21}$$

We will now prove that for any $z > 0$ we have

$$B \le z + \frac{qz}{e^z - 1}, \tag{22}$$

a contradiction with (21) and $B' < B$.

Our goal (inequality (22) with parametrization (20)) is to show that for any $w > 0$ and any
$z > 0$

$$\frac{w^2}{e^{-w} + w - 1} \le z + \frac{e^{w} + e^{-w} - 2}{e^{-w} + w - 1} \cdot \frac{z}{e^z - 1}. \tag{23}$$

Inequality (23) is equivalent to (we multiply both sides by $e^{-w} + w - 1 > 0$ and $e^z - 1 > 0$ and
simplify)

$$0 \le z(e^z - e^w)(e^{-w} - 1) - w(w - z)(e^z - 1) =: G_1(w, z). \tag{24}$$

We have

$$G_1(s + y, 2s) = (s^2 - y^2)(e^{2s} - 1) - 2s(e^s - e^y)(e^s - e^{-y}) =: G_2(s, y).$$

We will show $G_2(s, y) \ge 0$ for all $s > 0$ and $y \ge -s$. We have $G_2(s, -s) = 0$ and $\lim_{y \to \infty} G_2(s, y) = \infty$.
Thus it is enough to explore the critical points of $G_3(y) := G_2(s, y)$ for each $s$. We have

$$\frac{\partial}{\partial y} G_3(y) = 2e^s\left(s(e^y - e^{-y}) - y(e^s - e^{-s})\right).$$

The function $y \mapsto (e^y - e^{-y})/y$ is monotone for $y > 0$ (this follows from the series expansion) and
hence the only critical points of $G_3(y)$ are $y = 0$ and $y = \pm s$. For $y = \pm s$ we have $G_3(y) = 0$. For
$y = 0$ we have

$$G_3(0) = s^2(e^{2s} - 1) - 2s(e^s - 1)^2 = \sum_{i=5}^{\infty} \frac{2^i(i-5) + 16}{4(i-1)!} s^i > 0.$$

This establishes non-negativity of $G_3(y)$ for $y \ge -s$ for all $s > 0$, which implies (22). □
As a consequence of the existence of local maxima of $\Psi_1$ for $B > \mathfrak{B}_u$ and Lemmas 2 and 6 we
obtain that $F$ has a non-uniform fixpoint. The Swendsen-Wang algorithm will tend to get stuck
around this fixpoint (as we show in Section 5).

Corollary 7. For $B > \mathfrak{B}_u$ the function $F$ has a Jacobian attractive fixpoint $a$ where $a > 1/q$ (the
value of $a$ depends on $B$).

The following lemma establishes that, for $B = \mathfrak{B}_u$, the spectral radius of $F$ at the fixpoint $a$
is equal to 1. Thus, the first-order derivative is not sufficient to study the attractiveness of the
fixpoint $a$.

Lemma 8. For $B = \mathfrak{B}_u$, it holds that $F'(a) = 1$.

Remark 1. Note, the non-attractiveness of the fixpoint $a$ for $B = \mathfrak{B}_u$ follows from $F'(a) = 1$ and
$F''(a) < 0$ (Lemma 4).

Proof. From (15), it suffices to show that $\Psi_1''(a) = 0$. We will need to get an explicit handle
on $\mathfrak{B}_u$. Consider the function $f(z) := z + \frac{qz}{e^z - 1}$ for $z > 0$ and recall from definition (2) that
$\mathfrak{B}_u = \min_{z > 0} f(z)$. It is not hard to see that the minimum is uniquely achieved at $z = z_o > 0$.
Thus, we have $\mathfrak{B}_u = f(z_o)$ and $f'(z_o) = 0$.

From $\mathfrak{B}_u = f(z_o)$, we obtain that (writing $B = \mathfrak{B}_u$)

$$e^{-z_o} = \frac{B - z_o}{B + (q-1)z_o} \quad \text{or} \quad z_o = \ln \frac{B + (q-1)z_o}{B - z_o}. \tag{25}$$

From $f'(z_o) = 0$, we obtain $e^{-z_o} = \frac{(B - z_o)^2}{Bq}$. Equating the two expressions for $e^{-z_o}$ shows that
$z_o$ satisfies

$$-z_o^2(q-1) + z_o B(q-2) + B(B - q) = 0. \tag{26}$$

Now we are ready to prove that $\Psi_1''(a) = 0$. The observation here is that there is a correspondence
between $z_o$ and $a$ given by the transformation $z_o \mapsto a = \frac{B + (q-1)z_o}{Bq}$ (under the transformation, the second
equation in (25) becomes (5)). The inverse transformation $a \mapsto z = \frac{B(qa - 1)}{q-1}$ transforms
$\Psi_1''(a) = B\frac{q}{q-1} - \frac{1}{a(1-a)}$ from (14) into

$$\frac{Bq\left(-z^2(q-1) + zB(q-2) + B(B-q)\right)}{(q-1)(B-z)(B+qz-z)},$$

which is zero at $z = z_o$ from (26). □

Finally, we classify the attractiveness of the uniform fixpoint $u = 1/q$ of $F$ for $B \le \mathfrak{B}_{rc}$.

Lemma 9. For all $B < \mathfrak{B}_{rc}$, the fixpoint $u = 1/q$ of $F$ is Jacobian attractive. For $B = \mathfrak{B}_{rc}$, the
fixpoint $u = 1/q$ is Jacobian repulsive.

Proof. For $B < \mathfrak{B}_{rc}$, we have that $F$ is constant throughout $[1/q, 1/B]$, so trivially $F'(1/q) = 0$
and hence $u$ is Jacobian attractive.

For $B = \mathfrak{B}_{rc} = q$, rewrite (7) as

$$zq = f(x), \quad \text{where } f(x) := \frac{-\ln(1-x)}{x}. \tag{27}$$

Note that as $x \downarrow 0$, we have $z \downarrow 1/q$. Then, for all sufficiently small $x > 0$, an expansion of $f$ around
$x = 0$ yields

$$zq = 1 + \frac{x}{2} + O(x^2).$$

It is not hard from here to conclude

$$x = 2q(z - 1/q) + O((z - 1/q)^2),$$

for all $z$ in a small neighborhood of $1/q$. It follows that

$$F'(1/q) = 2(q-1)/q > 1,$$

for all $q \ge 3$, and hence $u$ is Jacobian repulsive. □
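The value $F'(1/q) = 2(q-1)/q$ at $B = \mathfrak{B}_{rc} = q$ can be checked numerically without solving (7): for a small $x > 0$ one can read off $z = -\ln(1-x)/(qx)$ from (27), evaluate $F(z) = 1/q + (1 - 1/q)zx$, and form the difference quotient at $1/q$. The sketch below does exactly that; the function name `slope_at_uniform` is hypothetical, and the result is a one-sided (right) difference quotient.

```python
import math

def slope_at_uniform(q, x):
    """Right difference quotient (F(z) - 1/q) / (z - 1/q) at B = q,
    where z is the density corresponding to giant fraction x via (27)."""
    z = -math.log1p(-x) / (q * x)            # z*q = f(x) = -ln(1-x)/x
    F_z = 1.0 / q + (1.0 - 1.0 / q) * z * x  # definition (6) of F
    return (F_z - 1.0 / q) / (z - 1.0 / q)

if __name__ == "__main__":
    for q in (3, 4, 10):
        print(q, slope_at_uniform(q, 1e-4), 2.0 * (q - 1) / q)
```

As $x$ shrinks the quotient approaches $2(q-1)/q$, which exceeds 1 for every $q \ge 3$.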

We are now set to prove Lemma 3 from Section 2. 

Proof of Lemma 3. The statements for the attractiveness of the fixpoint $u = 1/q$ of $F$ follow from
Lemma 9. The Jacobian attractiveness of the fixpoint $a > 1/q$ of $F$ follows from Corollary 7.
Finally, Lemmas 4 and 8 show that, at $B = \mathfrak{B}_u$, the fixpoint $a > 1/q$ of $F$ is not attractive (see
also Remark 1). □

Finally, we prove the following lemma which will be used in the proof of the rapid mixing results
for $B \ge \mathfrak{B}_{rc}$.

Lemma 10. For $B \ge \mathfrak{B}_{rc}$, there is a unique fixpoint of $F$ with $a > 1/q$.

Proof. From the first part of Lemma 2, it suffices to prove that there is a unique critical point of $\Psi_1$ with $a > 1/q$. We have

$\Psi_1'(a) = B\left(a - \frac{1-a}{q-1}\right) - \ln\frac{(q-1)a}{1-a}$,

$\Psi_1''(a) = \frac{Bq}{q-1} - \frac{1}{a(1-a)}$.

Consider the polynomial $p(a) = a^2 - a + \frac{q-1}{Bq}$ and note that for all $a \in (0,1)$, $p(a) \cdot \Psi_1''(a) \le 0$ and $\Psi_1''(a) = 0$ iff $p(a) = 0$. When $B \ge \mathfrak{B}_{rc}$, there is exactly one zero of $p(a)$ in the interval $(1/q, 1)$, say $a_0$, and another one in the interval $(0, 1/q]$. To see this for $B > \mathfrak{B}_{rc}$, observe that $p(0) > 0$, $p(1/q) = \frac{(q-1)(q-B)}{Bq^2} < 0$ (since $B > \mathfrak{B}_{rc} = q$) and $p(1) > 0$. For $B = \mathfrak{B}_{rc}$, observe that the root of $p$ in the interval $(0, 1/q]$ is $1/q$ since $p(1/q) = 0$. Further, since $p'(1/q) = 2/q - 1 < 0$ (here we used that $q > 2$), we have that for small enough $\epsilon > 0$, it holds that $p(\epsilon + 1/q) < 0$ and hence, using that $p(1) > 0$, we also have a root in the interval $(1/q, 1)$.

Thus, $\Psi_1'(a)$ is increasing for $1/q < a < a_0$ and decreasing for $a_0 < a < 1$. Since $\Psi_1'(1/q) = 0$ and $\Psi_1'(a) \downarrow -\infty$ as $a \uparrow 1$, there exists exactly one critical point with $a > 1/q$ for $B \ge \mathfrak{B}_{rc}$. This establishes the lemma. □
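The root count used in the proof of Lemma 10 can be checked directly from the quadratic formula. The sketch below is only an illustration: it assumes the polynomial $p(a) = a^2 - a + (q-1)/(Bq)$ from the proof, and the function names and the tolerance `tol` (which keeps the boundary root $a = 1/q$ at $B = q$ from being counted) are hypothetical choices.

```python
import math

def roots_of_p(B: float, q: int) -> list:
    """Real roots of p(a) = a^2 - a + (q-1)/(B*q), smallest first."""
    disc = 1.0 - 4.0 * (q - 1) / (B * q)
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(1.0 - s) / 2.0, (1.0 + s) / 2.0]

def roots_above_uniform(B: float, q: int, tol: float = 1e-12) -> list:
    """Roots of p lying strictly inside (1/q, 1), up to the tolerance tol."""
    return [a for a in roots_of_p(B, q) if 1.0 / q + tol < a < 1.0]
```

For every pair with $B \ge q$ and $q \ge 3$ the second function should return exactly one root, matching the single zero $a_0 \in (1/q, 1)$ in the proof.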


4 Random Graph Lemmas 

In this section, we collect relevant results from the literature for the sizes of the components in $G(n,p)$ where $p \sim 1/n$. We will use these to analyze the percolation step of the SW algorithm.

For $G \sim G(n,p)$, we denote by $C_1, C_2, \ldots$ the connected components of $G$ in decreasing order of size, where we will refer throughout to the size of a component as the number of vertices in it. We denote by $|C|$ the size of a component $C$. We will be particularly interested in the size of the largest component and the sum of squares of the sizes of the components, since these control the expected size of the largest color class and the variance of the percolation step of the SW algorithm.

4.1 The supercritical regime

We will need several known results on the $G(n,p)$ model in the supercritical regime ($p = c/n$, where $c > 1$). The size of the giant component is asymptotically normal [24]. We will use the following moderate deviation inequalities for the sizes of the largest and second largest components of $G$. These are used to track the evolution of the SW-dynamics for an exponential number of steps in the slow mixing regime $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$.

Lemma 11. Let $G \sim G(n, c/n)$ where $c > 1$. Let $\beta \in (0,1)$ be the solution of $x + \exp(-cx) = 1$. Let $C_1, C_2$ be the largest and second largest components of $G$ respectively. Then, for every constant $\epsilon \in (0, 1/3]$ it holds that

$P\bigl(\bigl||C_1| - \beta n\bigr| \ge n^{1/2+\epsilon}\bigr) \le \exp(-\Theta(n^{2\epsilon}))$, (28)

$P(|C_2| \ge n^{\epsilon}) \le \exp(-\Theta(n^{\epsilon}))$. (29)

Proof. Equation (28) is proved in [17, Lemma 5.4]. We next prove equation (29). All the elements are contained in the proof of [15, Theorem 5.4]. The probability that there exists a component of size from the interval $\{n^{\epsilon}, \ldots, n^{2/3}\}$ is bounded by (see [15, p. 110, line 11]):

$\exp(-((c-1)^2/(9c))\,n^{\epsilon})$. (30)

The probability that there exist two or more components of size at least $n^{2/3}$ is bounded by (see [15, p. 110, line 24]):

$\exp(-((c-1)^2 c/4)\,n^{1/3})$. (31)

Using the union bound (combining (30) and (31)) we obtain (29), that is, with high probability we have only one component of size $\ge n^{\epsilon}$. □
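To illustrate the concentration of $|C_1|$ around $\beta n$ in the supercritical regime, the following sketch samples one graph from $G(n, c/n)$ and compares the largest component with the root $\beta$ of $x + \exp(-cx) = 1$. It is only an illustration under stated assumptions: the union-find sampler, the fixed-point solver for $\beta$, the seed and the graph size are all hypothetical choices, and a single sample says nothing about the tail bounds (28) and (29) themselves.

```python
import math
import random

def beta(c: float) -> float:
    """Largest root of x + exp(-c*x) = 1, via the monotone iteration x <- 1 - exp(-c*x)."""
    x = 1.0
    for _ in range(10000):
        x = 1.0 - math.exp(-c * x)
    return x

def largest_component_fraction(n: int, c: float, seed: int = 0) -> float:
    """Sample G(n, c/n) edge by edge and return |C_1| / n."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))
    size = [1] * n

    def find(v: int) -> int:
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                ru, rv = find(u), find(v)
                if ru != rv:
                    if size[ru] < size[rv]:
                        ru, rv = rv, ru
                    parent[rv] = ru          # union by size
                    size[ru] += size[rv]
    return max(size[find(v)] for v in range(n)) / n
```

With $c = 2$ the root is $\beta \approx 0.797$, and for $n$ in the low thousands the sampled fraction is typically within a few percent of it.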

The following lemma will be used to analyze the evolution of the SW-chain when $B = \mathfrak{B}_u$.

Lemma 12. Let $G \sim G(n, c/n)$ where $c_0 \le c \le c_1$ for absolute constants $c_0, c_1 > 1$ ($c$ may otherwise depend on $n$). Let $C_1$ be the largest component in $G$. Then, for every constant $\epsilon > 0$, for all $n$ sufficiently large it holds that

$n\beta - n^{\epsilon} \le E[|C_1|] \le n\beta + n^{\epsilon}$, (32)

where $\beta \in (0,1)$ is the (unique) solution of $\beta + \exp(-\beta c) = 1$.

Also, there exist constants $K_1, K_2, K_3 > 0$ (depending on $c_0, c_1$) such that for all sufficiently large $n$ it holds that

$K_1 n \le \mathrm{Var}[|C_1|] \le K_2 n$, $\quad E\Bigl[\sum_{i \ge 2} |C_i|^2\Bigr] \le K_3 n$. (33)


Proof. The first two parts of the lemma as well as the first two inequalities in (33) can be found in 
[7, Theorem 5]. The last inequality in (33) is an immediate corollary of [17, Corollary 5.6]. □ 

4.2 The scaling window & subcritical regimes

We use the following well-known result about the size of the largest component in the subcritical regime.

Lemma 13 (see, e.g., [15], p. 109). Let $t \in (0,1]$ be a constant. Let $G \sim G(n, c/n)$ where $c < 1$. Let $C_1$ be the largest component of $G$. Then

$P(|C_1| \ge n^t) \le \exp(-\Theta(n^t))$.

The following lemma considers the size of the components in the scaling window.

Lemma 14. There exist constants $K, c, c' > 0$ such that for any $n$ and

1. any $\epsilon \in (0,1)$, for random $G$ from $G(n, (1-\epsilon)/n)$ we have

$E\Bigl[\sum_{i \ge 1} |C_i|^2\Bigr] \le \frac{Kn}{\epsilon}$,

2. for any $\epsilon \in [1/n^{1/3}, c]$, for random $G$ from $G(n, (1+\epsilon)/n)$ we have

$E\Bigl[\sum_{i \ge 2} |C_i|^2\Bigr] \le \frac{Kn}{\epsilon}$,

3. for any $\epsilon \in [c', c]$, for random $G$ from $G(n, (1+\epsilon)/n)$ we have

$P\bigl([|C_1| < (7/4)\epsilon n] \cup [|C_1| > 3\epsilon n]\bigr) \le K\exp(-c\epsilon^3 n)$.

Proof. Part 1 follows from [17, Lemma 5.3 & Theorem 5.12]. Part 2 is [17, Theorem 5.13, Part (ii)]. Part 3 follows from [17, Lemma 5.4 & Theorem 5.9]. □

Lemma 15. Let $G \sim G(n,p)$, $p \ge (1 - An^{-1/3})/n$ where $A$ is a large constant. Let $C_1, C_2, \ldots$ be the connected components of $G$ in decreasing order of size. Then, for all sufficiently large constant $L > 0$, there exists a positive constant $p'$ such that for all $n$ sufficiently large it holds that

$P\Bigl(|C_1| \ge Ln^{2/3},\ \sum_{i \ge 2} |C_i|^2 \le n^{4/3}\Bigr) \ge p'$.

The proof of Lemma 15 is based on [17, Proof of Lemma 8.26]. We will use the following special 
case of [15, Theorem 5.20]. 


Corollary 16 ([15, Theorem 5.20]). Let $t$ be a positive integer and $d, a_1, \ldots, a_t, b_1, \ldots, b_t$ be such that $\infty \ge a_1 > b_1 > a_2 > b_2 > \ldots > a_t > b_t > d > 0$. Let $c$ be a constant (not necessarily positive) and let $p = (1 + cn^{-1/3})/n$.

For $G \sim G(n,p)$ denote by $C_1, C_2, \ldots$ the connected components of $G$ in decreasing order of their sizes. There exists $\ell := \ell(c, t, d, a_1, \ldots, a_t, b_1, \ldots, b_t) > 0$ such that for all sufficiently large $n$, it holds that

$P\left(a_1 > \frac{|C_1|}{n^{2/3}} > b_1, \ldots, a_t > \frac{|C_t|}{n^{2/3}} > b_t,\ d > \frac{|C_{t+1}|}{n^{2/3}}\right) \ge \ell$.

Proof. The statement of [15, Theorem 5.20] is for the Erdős-Rényi random graph model $G(n,M)$ with $M = (n/2) + cn^{2/3}$. Since for $G \sim G(n,p)$ with $p = (1 + 2cn^{-1/3})/n$ the number of edges is $(n/2) + cn^{2/3} + O(\sqrt{n})$ with probability $\Omega(1)$, the corollary follows. □


Proof of Lemma 15. Let $A > 0$ be a large constant. We consider two cases. If $p > (1 + An^{-1/3})/n$, we have from Part 2 of Lemma 14 that $E\bigl[\sum_{j \ge 2} |C_j|^2\bigr] \le Kn^{4/3}/A$, so by Markov's inequality

$P\Bigl(\sum_{j \ge 2} |C_j|^2 \ge n^{4/3}\Bigr) \le \frac{K}{A}$.

From Corollary 16 (with $t = 1$, $b_1 = L$) we obtain that for $p = 1/n$, $|C_1|$ is greater than $Ln^{2/3}$ with asymptotically positive probability $p_1$ for any constant $L > 0$. Note that for $p \ge 1/n$ we can couple $G \sim G(n, 1/n)$ and $G' \sim G(n,p)$ so that $G$ is a subgraph of $G'$. Since $|C_1|$ is monotone, it follows that for $p \ge 1/n$, $|C_1|$ is greater than $Ln^{2/3}$ with positive probability $p_1$. Provided that $A$ is sufficiently large (depending on $K, p_1$), by a union bound we have that

$P\Bigl(|C_1| \ge Ln^{2/3},\ \sum_{j \ge 2} |C_j|^2 \le n^{4/3}\Bigr) \ge 1 - (1 - p_1) - \frac{K}{A} > 0$.

If $(1 - An^{-1/3})/n \le p \le (1 + An^{-1/3})/n$, we have from Corollary 16 (with $t = 1$, $d = 1$, $b_1 = L$) and the argument in [17, Proof of Lemma 8.26] that for all sufficiently large $L$, it holds that

$P\Bigl(|C_1| \ge Ln^{2/3},\ \sum_{j \ge 2} |C_j|^2 \le n^{4/3}\Bigr) \ge p_2 > 0$,

where $p_2$ is a constant. The lemma follows. □

We will also use the following upper bound on the size of the largest component in the critical window.

Lemma 17 ([23, Corollary 5.6], see also [21, Theorems 1 & 7]). Let $G \sim G(n,p)$ with $p = (1 \pm cn^{-1/3})/n$ where $c$ is a sufficiently large constant. Let $C_1$ be the largest component in $G$. Then there exists a constant $r > 0$ such that for positive $A$ larger than an absolute constant, it holds that

$P(|C_1| > An^{2/3}) \le \exp(-rA^3)$.


5 Slow Mixing for $\mathfrak{B}_u < B < \mathfrak{B}_{rc}$

Let $B(v, \delta)$ be the $\ell_\infty$-ball of configuration vectors of the $q$-state Potts model in $K_n$ around $v$ of radius $\delta$, that is,

$B(v, \delta) = \{w \in \mathbb{Z}^q \mid \|w/n - v\|_\infty \le \delta\}$. (34)

We will show that for $B < \mathfrak{B}_{rc}$ the Swendsen-Wang algorithm is exponentially unlikely to leave the vicinity of the uniform configuration.

Lemma 18. Assume $B < \mathfrak{B}_{rc}$. There exists $\epsilon_0 > 0$ such that for all $\epsilon \in (0, \epsilon_0)$, for $S = B(\mathbf{u}, \epsilon)$,

$\Pr(X_1 \in S \mid X_0 \in S) \ge 1 - \exp(-\Theta(n^{1/2}))$.

The reason for Lemma 18 failing for $B > \mathfrak{B}_{rc}$ is that the first step of the Swendsen-Wang algorithm on a cluster of size $n/q$ yields linear sized connected components, and these allow the algorithm to escape the neighborhood of $\mathbf{u}$ (a somewhat similar argument applies for $B = \mathfrak{B}_{rc}$ as well, though in this case one has to account more carefully for the fluctuations of the largest components since the percolation step of the SW-dynamics is in the critical window for such configurations).

Proof of Lemma 18. Let $X_0 \in S$. The first step of the Swendsen-Wang algorithm chooses, for each color class, a random graph from $G(m,p)$, where $p = B/n$ and $m$ is the number of vertices of that color. For all sufficiently small $\epsilon$ we have

$p = \frac{B}{m} \cdot \frac{m}{n} \le \frac{d}{m}$,

where $d < 1$ (we used $B < q$ and $m \le n/q + \epsilon n$). Now Lemma 13 (with $t = 1/2$) implies that with probability at least

$1 - n\exp(-\Theta(n^{1/2}))$ (35)

all components after the first step have size $\le n^{1/2}$. The second step of the Swendsen-Wang algorithm colors each component by a uniformly random color; call the resulting state $X_1$. Let $Z_i$ be the number of vertices of color $i$ in $X_1$. By symmetry $E[Z_i] = n/q$.

Now assume that all components have size $\le n^{1/2}$. Then by, say, Azuma's inequality (see, e.g., [15], p. 37)

$\Pr(|Z_i - n/q| \ge \epsilon n) \le \exp(-\Theta(n^{1/2}))$, (36)

and hence $\Pr(X_1 \in S) \ge 1 - n\exp(-\Theta(n^{1/2}))$, which combined with (35) yields the Lemma. □
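The two steps used in the proof above (percolation inside each color class with $p = B/n$, then a uniformly random recoloring of every component) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `sw_step`, the list-of-colors state representation and the union-find bookkeeping are hypothetical, and the quadratic edge loop is only suitable for small $n$.

```python
import random
from collections import Counter

def sw_step(colors, q, B, rng):
    """One Swendsen-Wang step on the complete graph K_n with edge probability p = B/n."""
    n = len(colors)
    p = B / n
    parent = list(range(n))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Percolation step: keep each monochromatic edge independently with probability p.
    classes = {}
    for v, col in enumerate(colors):
        classes.setdefault(col, []).append(v)
    for members in classes.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                if rng.random() < p:
                    parent[find(members[i])] = find(members[j])

    # Coloring step: every connected component receives an independent uniform color.
    new_color = {}
    result = []
    for v in range(n):
        root = find(v)
        if root not in new_color:
            new_color[root] = rng.randrange(q)
        result.append(new_color[root])
    return result
```

Starting from a balanced configuration with $B < q$ (for instance $q = 3$, $B = 2$), every color class is subcritical, so one step should leave all color fractions close to $1/q$, in line with Lemma 18.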

We also analyze the behavior of the algorithm around the majority configuration (for the configuration to exist we need $B \ge \mathfrak{B}_u$).

Lemma 19. Assume $B > \mathfrak{B}_u$ and let $\mathbf{m} = (a, b, \ldots, b)$ where $a > 1/q$ is the attractive fixpoint of $F$ of Lemma 3. There exists a constant $\epsilon_0 > 0$ such that for all constant $\epsilon \in (0, \epsilon_0)$, for $S = B(\mathbf{m}, \epsilon)$, we have

$\Pr(X_1 \in S \mid X_0 \in S) \ge 1 - \exp(-\Theta(n^{1/3}))$. (37)

In fact, the conclusion (37) holds for e = as well. 

Proof of Lemma 19. Let $X_0 \in S$ and let $\gamma := F'(a)$ (recall that $|\gamma| < 1$, since $a$ is a Jacobian attractive fixpoint by Lemma 3). The first step of the Swendsen-Wang algorithm chooses, for each color class, a random graph from $G(m,p)$, where $p = B/n$ and $m$ is the number of vertices of that color. Let $m_1$ be the number of vertices of the dominant color. Since $X_0 \in S$ we have $m_1/n = a + \tau =: a'$ where $|\tau| \le \epsilon$. We can write

$p = (m_1 B/n)/m_1 = (a'B)/m_1$,

where $a'B > 1$ for sufficiently small $\epsilon_0 > 0$ (using $aB > 1$ from Lemma 5). This means that the $G(m,p)$ process in this color class is supercritical. Let $\beta \in (0,1]$ be the root of $x + \exp(-a'Bx) = 1$. By Lemma 11 the random graph will have, with probability $\ge 1 - \exp(-\Theta(n^{1/3}))$, one component of size $a'\beta n \pm n^{5/6}$ and all the other components will have size at most $n^{1/3}$.

Let $m_2$ be the number of vertices in one of the non-dominant colors. Since $X_0 \in S$ we have $m_2/n =: b'$ where

$b - \epsilon_0 \le b - \epsilon \le b' \le b + \epsilon \le b + \epsilon_0$. (38)

We can write

$p = (m_2 B/n)/m_2 = (b'B)/m_2$,

where $b'B < 1$ for sufficiently small $\epsilon_0 > 0$ (using $bB < 1$ from Lemma 5). This means that the $G(m,p)$ process in this color class is subcritical. By Lemma 13 (with $t = 1/3$), with probability $\ge 1 - \exp(-\Theta(n^{1/3}))$ the random graph will have all components of size at most $n^{1/3}$.

To summarize: starting from a configuration in $S$, after the first step of the Swendsen-Wang algorithm we have, with probability $\ge 1 - q\exp(-\Theta(n^{1/3}))$, one large component of size $a'\beta n \pm n^{5/6}$ and the remaining components are of size $\le n^{1/3}$ (small components). In the second step of the algorithm the components get colored by a random color. By symmetry, in expectation each color obtains $(n - a'\beta n \mp n^{5/6})/q$ vertices from the small components and by Azuma's inequality this number is $(n - a'\beta n \mp n^{5/6})/q \pm n^{5/6}$ with probability $\ge 1 - \exp(-\Theta(n^{1/3}))$. Combining the analysis of the first and the second step we obtain that at the end with probability $\ge 1 - 2q\exp(-\Theta(n^{1/3}))$ we have

$\left\|\alpha(X_1) - \left(F(a'), \frac{1 - F(a')}{q-1}, \ldots, \frac{1 - F(a')}{q-1}\right)\right\|_\infty \le 2n^{-1/6}$. (39)

For sufficiently small $\epsilon_0 > 0$ there exists $\gamma' \in (\gamma, 1)$ such that for all $|\tau| \le \epsilon_0$ we have $|F(a + \tau) - a| \le \gamma'|\tau|$. For sufficiently small $\epsilon_0 > 0$, for all $\epsilon \in (0, \epsilon_0)$ and all sufficiently large $n$ we have

$|F(a') \pm 2n^{-1/6} - a| \le \epsilon$ and $\left|\frac{1 - F(a')}{q-1} \pm 2n^{-1/6} - b\right| \le \epsilon$. (40)

Combining (39) and (40) gives that $X_1 \in S$ with probability at least $1 - \exp(-\Omega(n^{1/3}))$, which
finishes the proof of the first part of the lemma. In fact, since = o(n“^/'^), (40) also holds for 

e = as well. This shows the second part of the lemma, concluding the proof. □ 


Combining Lemmas 18 and 19 we obtain Part 3 of Theorem 1. 

Corollary 20. For $B \in (\mathfrak{B}_u, \mathfrak{B}_{rc})$ the Swendsen-Wang algorithm has mixing time $\exp(\Omega(n^{1/3}))$.

Proof. For some small constant $\epsilon > 0$, let $S_u = B(\mathbf{u}, \epsilon)$, $S_m = B(\mathbf{m}, \epsilon)$. By choosing $\epsilon$ sufficiently small, we can ensure that $S_u \cap S_m = \emptyset$ and that for some constant $C > 0$ it holds that

$\Pr(X_1 \in S_u \mid X_0 \in S_u) \ge 1 - \exp(-Cn^{1/3})$, $\quad \Pr(X_1 \in S_m \mid X_0 \in S_m) \ge 1 - \exp(-Cn^{1/3})$. (41)

Let $\mu$ be the stationary distribution of the SW chain, i.e., $\mu$ is the Potts distribution given in (1). Let $S := \arg\min\{\mu(S_u), \mu(S_m)\}$, so that $\mu(S) \le 1/2$. Let $X_0 \in S$ and $T = \exp(C'n^{1/3})$ for a constant $0 < C' < C$.

Then, using (41), we have that for all sufficiently large $n$

$\Pr(X_T \in S) \ge (1 - \exp(-Cn^{1/3}))^T \ge 1 - T\exp(-Cn^{1/3}) \ge 9/10$.

Observe now that

$d_{TV}(X_T, \mu) = \max_{A \subseteq \Omega} |\mu(A) - \Pr(X_T \in A)| \ge |\mu(S) - \Pr(X_T \in S)| \ge \frac{9}{10} - \frac{1}{2} > \frac{1}{4}$.

It follows from the definition of mixing time that $T_{mix} \ge T$, as claimed. □


6 Basic rapid mixing results 

Once the phases align then it is straightforward to couple the chains so that the configurations 
agree. The following lemma is essentially identical to [8, Lemma 4], which is also used in [17, 
Lemma 4.1]. 

Lemma 21 ([8], Lemma 4). For any constant $B > 0$, for all $q \ge 2$, all $\epsilon > 0$, for $T = O(\log n)$ there is a coupling where $\Pr[X_T \ne Y_T \mid \alpha(X_0) = \alpha(Y_0)] \le \epsilon$.

For completeness we include the proof of the lemma. 

Proof of Lemma 21. Let $A_t = \{v : X_t(v) = Y_t(v)\}$ and $D_t = V \setminus A_t$. We will define a one-step coupling which maintains $\alpha(X_t) = \alpha(Y_t)$ and where

$E[|D_{t+1}| \mid X_t, Y_t] = (1 - 1/q)|D_t|$. (42)

We'll define a matching $\tau : V \to V$. For $v \in A_t$ let $\tau(v) = v$. For $V \setminus A_t$ define $\tau$ so that for all $v \in V$, $X_t(v) = Y_t(\tau(v))$. In words, $\tau$ matches vertices with the same color (this is always possible since $\alpha(X_t) = \alpha(Y_t)$) and it uses the identity matching on those vertices whose colors agree in the two chains. In the percolation step of the Swendsen-Wang process, first perform the step for chain $X_t$; then for $Y_t$, for a pair $v, w$ where $Y_t(v) = Y_t(w)$, we delete the edge iff the edge $(\tau(v), \tau(w))$ is deleted. Therefore, the component sizes are identical for the two chains and we can couple the recoloring in the same manner so that if $v \in A_t$ then $v \in A_{t+1}$ and (42) holds. Then, by applying Markov's inequality,

$\Pr[X_t \ne Y_t \mid X_0, Y_0] \le n(1 - 1/q)^t \le \epsilon$

for $t = O(\log n)$. □
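The final bound $n(1 - 1/q)^t \le \epsilon$ can be turned into an explicit number of steps, $t \ge \ln(n/\epsilon)/\ln(q/(q-1))$. The sketch below computes such a $t$; the function name `coupling_steps` is a hypothetical helper, and the trailing loop only guards against floating-point rounding.

```python
import math

def coupling_steps(n: int, q: int, eps: float) -> int:
    """A number of steps t with n * (1 - 1/q)**t <= eps, from t >= ln(n/eps) / ln(q/(q-1))."""
    t = max(0, math.ceil(math.log(n / eps) / math.log(q / (q - 1))))
    while n * (1.0 - 1.0 / q) ** t > eps:  # guard against rounding in the logarithms
        t += 1
    return t
```

For fixed $q$ and $\epsilon$ the value grows like $\log n$: multiplying $n$ by 1000 adds roughly $\ln(1000)/\ln(q/(q-1))$ steps.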

It is enough to get the phases within $O(\sqrt{n})$ distance and then there is a coupling so that with constant probability the phases will be identical after one additional step.

Lemma 22 ([17], Theorem 6.5). Let $B > \mathfrak{B}_u$. Let $X_0, Y_0$ be a pair of configurations where $\|\alpha(X_0) - \mathbf{m}\|_\infty \le Ln^{-1/2}$, $\|\alpha(Y_0) - \mathbf{m}\|_\infty \le Ln^{-1/2}$, for an (arbitrarily large) constant $L > 0$. For all sufficiently large $n$, there exists a coupling such that with prob. $\Theta(1)$, $\alpha(X_1) = \alpha(Y_1)$.

Once again, for completeness we include the proof of the lemma. 

Proof. Our proof closely follows the approach in [17, Theorem 6.5] (which is for the case q = 2) 
with small differences in some of the technical details. 

Apply the percolation step of the Swendsen-Wang algorithm independently for the chains $X_0$ and $Y_0$. By Lemma 5.7 in [17], there is a constant $C > 0$ such that with probability $1 - O(1/n)$, there are $\ge Cn$ isolated vertices in each chain (i.e., components of size 1). Our goal will be to couple the colorings of the components using the $Cn$ isolated vertices to guarantee that $\alpha(X_1) = \alpha(Y_1)$.

In each chain, order the components by decreasing size. Next, couple the coloring step so that the largest component in each chain gets the same color. For the remaining components, color them independently in each chain in order of decreasing size, but leave the last $Cn$ components uncolored. As noted earlier, the remaining $Cn$ uncolored components in each chain consist of all isolated vertices (with probability $1 - O(1/n)$). Let $\hat{X}_1, \hat{Y}_1$ denote the configurations except on these $Cn$ uncolored components and denote by $x_i, y_i$ the number of vertices which are assigned color $i$ under $\hat{X}_1, \hat{Y}_1$ respectively.

We will show that under this coupling, for a (large) constant $L' > 0$, with probability $\Theta(1)$, it holds that

$|x_i - y_i| \le L'\sqrt{n}$ for all $i = 1, \ldots, q$. (43)

We will do this shortly; let us assume (43) for the moment and conclude the coupling argument. For $i \in [q]$, let $\ell_i := x_i - y_i$, so that $|\ell_i| \le L'\sqrt{n}$, and denote by $\ell$ the vector with coordinates $\ell_1, \ldots, \ell_q$. Further, denote the remaining $Cn$ uncolored vertices as $v_1, \ldots, v_{Cn}$. Let $Z_i$ be the r.v. which denotes the number of vertices from $v_1, \ldots, v_{Cn}$ that get color $i$ in $X_1$, and let $Z_i'$ denote the respective r.v. for $Y_1$. We will couple $Z := (Z_1, \ldots, Z_q)$ with $Z' := (Z_1', \ldots, Z_q')$ so that

$\Pr(Z' = Z + \ell) = \Theta(1)$. (44)

From this, we clearly obtain a coupling such that with probability $\Theta(1)$ we have $\alpha_i(X_1) = \alpha_i(Y_1)$ for $i \in [q]$. The coupling in (44) is nearly identical to the one used in [17, Lemma 6.7]; we give the details for completeness.




Consider $W := (W_1, \ldots, W_q)$, where $W$ follows the multinomial distribution $\mathrm{Mult}(Cn, \frac{1}{q}, \ldots, \frac{1}{q})$ and note that $Z, Z'$ have the same distribution as $W$. Let

$I(t) := \Bigl\{ w = (w_1, \ldots, w_q) \in \mathbb{Z}^q \ \Big|\ w_1, \ldots, w_q \in \Bigl[\frac{Cn}{q} - t\sqrt{n}, \frac{Cn}{q} + t\sqrt{n}\Bigr],\ w_1 + \ldots + w_q = Cn \Bigr\}$.

Standard deviation bounds (or, alternatively, using Stirling's approximation) yield that, for every constant $t > 0$, for $w = (w_1, \ldots, w_q) \in I(t)$, it holds that

$\Pr(W = w) \ge \frac{C_0}{(\sqrt{n})^{q-1}}$ (45)

for some constant $C_0 > 0$ (depending only on $q, C, t$).

The coupling $\mu$ of $Z, Z'$ will be defined to be optimal on pairs of the form $(w, w + \ell)$ with $w \in I(L')$. More precisely, for $w = (w_1, \ldots, w_q) \in I(L')$ we set

$\mu(Z = w, Z' = w + \ell) := \min\{\Pr(W = w), \Pr(W = w + \ell)\} \ge \frac{C_0}{(\sqrt{n})^{q-1}}$, (46)

where in the last inequality we used (45) (recall that the coordinates of $\ell$ are bounded in absolute value by $O(\sqrt{n})$). For pairs $(w, w') \notin \{(w, w + \ell) \mid w \in I(L')\}$, the coupling is independent (the construction is analogous to the one used in the proof of the Coupling lemma, see [16, Section 4.2]). Now note that

$\mu(Z' = Z + \ell) \ge \sum_{w \in I(L')} \mu(Z = w, Z' = w + \ell) = \Omega(1)$,

where in the last equality we used (46) and the fact that the number of $w$ in $I(L')$ is $\Omega((\sqrt{n})^{q-1})$. This proves (44) with the coupling $\mu$, and hence, modulo the proof of (43) which is given below, the proof of Lemma 22 is complete.

To prove (43), we may assume w.l.o.g. that the largest component received color 1 (in each of the chains, by the coupling). Let $n' = n - Cn$ and denote by $C_{1,X}, C_{1,Y}$ the largest components after the percolation step of the SW-dynamics on $X_0, Y_0$ respectively. Crucially, since the configurations $X_0$ and $Y_0$ are close to $\mathbf{m}$, by Lemma 12, we have with probability $\Theta(1)$ that

$\bigl||C_{1,X}| - |C_{1,Y}|\bigr| \le K_0\sqrt{n}$

for some (large) constant $K_0 > 0$. We will further show that for a (large) constant $K_1 > 0$, with probability $\Theta(1)$, it holds that

$\left|x_1 - |C_{1,X}| - \frac{n' - |C_{1,X}|}{q}\right| \le K_1\sqrt{n}$ and $\left|x_i - \frac{n' - |C_{1,X}|}{q}\right| \le K_1\sqrt{n}$ for $i \ne 1$, (47)

and, by an identical argument, the analogous inequalities for the $y_i$'s. Combining these, we obtain (43) with $L' = 2(K_0 + K_1)$.

It remains to show (47). Consider the configuration $X_0$. W.l.o.g. we may assume that color 1 induces the largest color class in $X_0$, so that the assumption $\|\alpha(X_0) - \mathbf{m}\|_\infty \le Ln^{-1/2}$ translates into

$|\alpha_1(X_0) - a| \le Ln^{-1/2}$, $\quad |\alpha_i(X_0) - b| \le Ln^{-1/2}$ for $i \ne 1$.

From this, we have that color 1 is supercritical in the percolation step of the SW-dynamics, while the colors $2, \ldots, q$ are subcritical. Let $C_1, C_2, \ldots$ be the components in decreasing order of size after performing the percolation step in $X_0$ and note that $C_1 = C_{1,X}$. We have

$E\Bigl[\sum_{j \ge 2} |C_j|^2\Bigr] \le Kn$

for some absolute constant $K > 0$. To see this, use the bound in Lemma 12, equation (33), for the (supercritical) color class 1 and Lemma 14 Item 1 for each of the (subcritical) color classes $2, \ldots, q$. By Markov's inequality (and restricting our attention to components other than the isolated vertices $\{v_1\}, \ldots, \{v_{Cn}\}$) we obtain that with probability $\Theta(1)$ it holds that

$\sum_{j \ge 2;\ C_j \ne \{v_1\}, \ldots, \{v_{Cn}\}} |C_j|^2 \le K'n$ (48)

for some absolute constant $K' > 0$. Now, for $i = 1, \ldots, q$ let $J_i$ be the number of vertices colored with $i$ among the vertices other than $v_1, \ldots, v_{Cn}$ and those that belonged to the component $C_{1,X}$. Note that

$x_1 = |C_{1,X}| + J_1$, $\quad x_i = J_i$ for $i \ne 1$. (49)

Observe that $E[J_i] = (n' - |C_{1,X}|)/q$. Further, using (48) and Azuma's inequality, we obtain that with probability $\Theta(1)$ it holds that

$\left|J_i - \frac{n' - |C_{1,X}|}{q}\right| \le K''\sqrt{n}$ for $i = 1, \ldots, q$, (50)

for some absolute constant $K'' > 0$. Combining (49) and (50) yields (47) (with $K_1 = K''$), as wanted.

This completes the proof of (43) and hence the proof of Lemma 22. □ 


7 Fast mixing for $B > \mathfrak{B}_{rc}$


Let $\epsilon > 0$. We say a color $i$ is $\epsilon$-heavy if $\alpha_i \ge (1 + \epsilon)/B$. We say that a color is $\epsilon$-light if $\alpha_i \le (1 - \epsilon)/B$. For a state $X_t$, we denote by $S_t$ the size of the largest color class in $X_t$. We will show that the SW-algorithm has a reasonable chance of moving into a state where one color is $\epsilon$-heavy and the remaining $q - 1$ colors are $\epsilon$-light.


Lemma 23. Assume $B > \mathfrak{B}_{rc}$ is a constant. There exists $\epsilon > 0$ such that the following hold. For any $n$ and any initial state $X_0$, with probability $\Theta(1)$ the next state $X_1$ has one $\epsilon$-heavy color and the remaining $q - 1$ colors are $\epsilon$-light. Further, if $X_0$ has one $\epsilon$-heavy color and the remaining $q - 1$ colors are $\epsilon$-light, then the same is true for $X_1$ with probability $1 - o(1)$.


Before proving Lemma 23 we will need the following function. Let $g : [0,1] \to [0,1]$ be such that $g(z)n$ is the size of the giant component in $G(zn, B/n)$ (as $n \to \infty$). For $z \le 1/B$ we have $g(z) = 0$; for $z > 1/B$ we have $g(z) = zx$, where $x$ is the largest solution of $x + \exp(-zBx) = 1$ in $[0,1]$. Note that in the interval $(1/B, 1]$ the functions $F$ and $g$ are connected by the relation $F(z) = \frac{1}{q} + (1 - \frac{1}{q})g(z)$.

We have the following inequalities.


Lemma 24. Assume $B > \mathfrak{B}_{rc}$. Then

$\frac{1}{q} + \frac{q-1}{q}\, g\left(1 - \frac{q-1}{B}\right) > 1 - \frac{q-1}{B}$. (51)


Lemma 25. The minimum of

$\sum_{i=1}^{q} g(\alpha_i)$ (52)

over non-negative $\alpha_i$'s that sum to 1 is achieved for $\alpha_2 = \alpha_3 = \cdots = \alpha_q = 1/B$ and $\alpha_1 = 1 - (q-1)/B$.

Proof of Lemma 24. Let $z := 1 - (q-1)/B$. First note that $z > 1/B$ since $B > q$. Let $x \in (0,1)$ be the solution of $x + \exp(-zBx) = 1$. Equation (51) is equivalent to

$x > \frac{B - q}{B - q + 1}$. (53)

Suppose that (53) is false, that is, $x \le (B - q)/(B - q + 1)$. We then have

$1 - \frac{q-1}{B} = -\frac{\ln(1 - x)}{Bx} \le \frac{(B - q + 1)\ln(B - q + 1)}{B(B - q)}$, (54)

where the equality follows from $x + \exp(-zBx) = 1$ and the inequality follows from the fact that $x \mapsto -\frac{\ln(1-x)}{x}$ is an increasing function on $(0,1)$. Inequality (54) yields that $B - q \le \ln(1 + B - q)$, which is false (since $B - q > 0$), and hence we have a contradiction. This shows that (53) is true. □
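Inequality (53) is easy to test numerically. In the sketch below, the quantity $zB$ for $z = 1 - (q-1)/B$ equals $B - q + 1$, so the check reduces to comparing the largest root of $x + \exp(-cx) = 1$ with $(c-1)/c$ for $c = B - q + 1 > 1$. The function names and the bisection solver are illustrative assumptions.

```python
import math

def giant_fraction(c: float) -> float:
    """Largest root in [0, 1] of x + exp(-c*x) = 1 (equal to 0 when c <= 1)."""
    if c <= 1.0:
        return 0.0
    # The positive root lies to the right of the minimizer ln(c)/c of x + exp(-c*x) - 1.
    lo, hi = math.log(c) / c, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + math.expm1(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inequality_53_holds(B: float, q: int) -> bool:
    """Check x > (B - q) / (B - q + 1) for z = 1 - (q-1)/B, where z*B = B - q + 1."""
    c = B - q + 1.0
    return giant_fraction(c) > (B - q) / (B - q + 1.0)
```

The check should succeed for every $B > q$, since at $x = (c-1)/c$ the left-hand side $x + \exp(-cx) - 1$ equals $e^{-(c-1)} - 1/c < 0$.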

Proof of Lemma 25. First note that $g(z)$ is increasing and concave for $z > 1/B$ (this follows by Lemma 4 since $F(z) = \frac{1}{q} + (1 - \frac{1}{q})g(z)$ for $z \in (1/B, 1]$). If $\alpha_i > 1/B$ and $\alpha_j < 1/B$ then we can decrease the value of (52) by decreasing $\alpha_i$ and increasing $\alpha_j$ (since $g(z) = 0$ for all $z \le 1/B$ and $g(z)$ is increasing for $z > 1/B$). If $1/B < \alpha_i \le \alpha_j$ then we can decrease the value of (52) by decreasing $\alpha_i$ and increasing $\alpha_j$ (since $g(z)$ is strictly concave for $z > 1/B$). Note that at least one $\alpha_i$ has to be strictly larger than $1/B$. From the preceding discussion it follows that at a minimum one $\alpha_i$ is larger than $1/B$ and the remaining $\alpha_j$'s are equal to $1/B$. □

Proof of Lemma 23. Let $\alpha_i$ be the number of vertices of color $i$ in $X_0$, normalized by $n$. After the percolation step of the SW algorithm, with probability $1 - o(1)$ the largest component in color class $i$ has size $g(\alpha_i)n + o(n)$ (for $\alpha_i \le 1/B$ the claim is still true since then $g(\alpha_i) = 0$). Suppose that in the coloring step of the SW algorithm all the largest components receive color 1 (this happens with probability $q^{-q} = \Theta(1)$). The remaining components have, with probability $1 - o(1)$, size $o(n)$ and hence, by Chernoff bound, with probability $1 - o(1)$ we end up in state $(\alpha n + o(n), \beta n + o(n), \ldots, \beta n + o(n))$ where $\beta = (1 - \alpha)/(q - 1)$ and

$\alpha = \frac{1}{q} + \frac{q-1}{q}\sum_{i=1}^{q} g(\alpha_i) \ge \frac{1}{q} + \frac{q-1}{q}\, g\left(1 - \frac{q-1}{B}\right) > 1 - \frac{q-1}{B}$; (55)

the last inequality follows from Lemmas 24 and 25 (note that the strict inequality comes from Lemma 24). From (55) we obtain $\beta < 1/B$. Let $\epsilon = \min((1/B - \beta)/2, (B - q)/(2B)) > 0$. We have that one color is $\epsilon$-heavy and the remaining colors are $\epsilon$-light. Note that this happened with probability $q^{-q} + o(1) = \Theta(1)$.

The argument for the second statement of the lemma is almost identical, the only difference being that there is just one color class with $\alpha_i > 1/B$ and thus after the percolation step of the SW-dynamics with probability $1 - o(1)$ there is just one giant component of size $g(\alpha_i)(n - o(n))$ and the rest have size $o(n)$. □

After applying Lemma 23 the behavior of the Swendsen-Wang algorithm will be controlled by the function $F$ (namely, $S_{t+1}$ will be close to $nF(S_t/n)$) and then with constant probability after $O(1)$ steps the state will be close to the majority phase $\mathbf{m}$.

Lemma 26. Assume $B > \mathfrak{B}_{rc}$ is a constant. For any $\delta > 0$ and any starting state $X_0$, after $T = O(1)$ steps with probability $\Theta(1)$ the SW-algorithm moves to a state $X_T$ such that $\|\alpha(X_T) - \mathbf{m}\|_\infty \le \delta$.

Proof. By Lemma 23 with probability $\Theta(1)$ state $X_1$ has one $\epsilon$-heavy color and the remaining $q - 1$ colors are $\epsilon$-light. Assume that the largest color class in $X_1$ has size $z_1$ (normalized by $n$). Now we do the next step of the SW algorithm. With probability $1 - o(1)$ in the percolation step the largest component of the $\epsilon$-heavy color has size $g(z_1)n(1 + o(1))$ and all the other components have size $o(n)$. Thus, by Chernoff bound, after the coloring step we have one color class of size $F(z_1)n(1 + o(1))$ and the remaining color classes have roughly equal sizes, in particular, each is of size $((1 - F(z_1))/(q-1))n(1 + o(1))$.

By the second part of Lemma 23, with probability $1 - o(1)$, the bigger color class is $\epsilon$-heavy and the smaller color classes are $\epsilon$-light. Note that $F$ is increasing (cf. Lemma 4) and hence $F^{(T)}([1/q, 1]) = [F^{(T)}(1/q), F^{(T)}(1)]$. Since $B > \mathfrak{B}_{rc}$, by Lemma 3, $F$ has a unique fixpoint $a$. Hence, using also again that $F$ is increasing, for constant $T$ (depending on $\delta$) we have $[F^{(T)}(1/q), F^{(T)}(1)] \subseteq [a - \delta/2, a + \delta/2]$. Thus in $T$ steps, with probability $\Theta(1)$, we are within $\ell_\infty$-distance $\delta$ of $\mathbf{m}$. □
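The contraction of the interval $[F^{(T)}(1/q), F^{(T)}(1)]$ onto the fixpoint $a$ can be observed by iterating $F$ numerically. The sketch below is an illustration only: it assumes $F(z) = 1/q + (1 - 1/q)zx$ with $x$ the largest root of $x + \exp(-zBx) = 1$, and the function names, the bisection solver and the choice $B = 4$, $q = 3$ in the usage note are hypothetical.

```python
import math

def giant_fraction(c: float) -> float:
    """Largest root in [0, 1] of x + exp(-c*x) = 1 (equal to 0 when c <= 1)."""
    if c <= 1.0:
        return 0.0
    lo, hi = math.log(c) / c, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + math.expm1(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(z: float, B: float, q: int) -> float:
    """Deterministic drift map for the (normalized) size of the largest color class."""
    return 1.0 / q + (1.0 - 1.0 / q) * z * giant_fraction(z * B)

def iterate_F(z: float, B: float, q: int, steps: int) -> float:
    """Apply F repeatedly, i.e. compute F^{(steps)}(z)."""
    for _ in range(steps):
        z = F(z, B, q)
    return z
```

For $B = 4 > q = 3$, iterating from both endpoints $1/q$ and $1$ should produce the same limit (roughly $0.956$), which is the fixpoint $a$ of Lemma 10.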

Then we show that once we are constant distance from $\mathbf{m}$ then in $O(\log n)$ steps the distance to $\mathbf{m}$ further decreases to $O(n^{-1/2})$.

Lemma 27. For $B > \mathfrak{B}_u$ there exist $\delta, L > 0$ such that the following is true. Suppose that we start at a state $X_0$ such that $\|\alpha(X_0) - \mathbf{m}\|_\infty \le \delta$. Then in $T = O(\log n)$ steps with probability $\Theta(1)$ the SW algorithm ends up in a state $X_T$ such that

$\|\alpha(X_T) - \mathbf{m}\|_\infty \le Ln^{-1/2}$. (56)


Proof. Let $\delta > 0$ be such that for some $c < 1$ for all $z \in [a - \delta, a + \delta]$ we have $|F(z) - a| \le c|z - a|$. Note that the existence of such $\delta$ is guaranteed by the Jacobian attractiveness of the fixpoint $a$ throughout the regime $B > \mathfrak{B}_u$, see Lemma 3.

Suppose that we are in $X_t$ with $\|\alpha(X_t) - \mathbf{m}\|_\infty \le \delta$. Let $L > 0$ be a sufficiently large constant to be chosen later. If condition (56) is satisfied then we stop. Otherwise let $C_1, C_2, \ldots$ be all the connected components after the percolation step, sorted by decreasing size. By Lemma 14 we have

$E\Bigl[\sum_{i \ge 2} |C_i|^2\Bigr] \le K'n$.

Let $w_t \ge 1$; we will fix the value of $w_t$ later. By Markov's inequality

$P\Bigl(\sum_{i \ge 2} |C_i|^2 \le w_t K'n\Bigr) \ge 1 - 1/w_t$. (57)

Assuming the event in (57), by Azuma's inequality, in the coloring step of the SW algorithm the number $Z_i$ of vertices in $C_2 \cup C_3 \cup \ldots$ that receive color $i$ is concentrated around the expectation

$P\bigl(|Z_i - E[Z_i]| \ge w_t\sqrt{K'n}\bigr) \le \exp(-w_t/2)$. (58)

Let $z = S_t/n$ (recall that $S_t$ is the size of the largest color class in $X_t$). The size of $C_1$ is asymptotically normal around $g(z)n$ (see [24]) and hence for some constant $U > 0$

$P\bigl(\bigl||C_1| - g(z)n\bigr| \ge w_t\sqrt{n}\bigr) \le \exp(-Uw_t^2)$. (59)

Combining (57), (58), and (59) we obtain that with probability at least

$(1 - 1/w_t)(1 - \exp(-w_t/2) - \exp(-Uw_t^2))$ (60)

we have that

$\left\|\alpha(X_{t+1}) - \left(F(z), \frac{1 - F(z)}{q-1}, \ldots, \frac{1 - F(z)}{q-1}\right)\right\|_\infty \le w_t(1 + \sqrt{K'})\,n^{-1/2}$. (61)

We have

$\left\|\left(F(z), \frac{1 - F(z)}{q-1}, \ldots, \frac{1 - F(z)}{q-1}\right) - \mathbf{m}\right\|_\infty \le c\,\|\alpha(X_t) - \mathbf{m}\|_\infty$. (62)

Equations (61) and (62) combined yield

$\|\alpha(X_{t+1}) - \mathbf{m}\|_\infty \le w_t(1 + \sqrt{K'})\,n^{-1/2} + c\,\|\alpha(X_t) - \mathbf{m}\|_\infty \le \frac{c+1}{2}\,\|\alpha(X_t) - \mathbf{m}\|_\infty$, (63)

for

$w_t := \frac{1-c}{2} \cdot \frac{\sqrt{n}\,\|\alpha(X_t) - \mathbf{m}\|_\infty}{1 + \sqrt{K'}} \ge \frac{1-c}{2} \cdot \frac{L}{1 + \sqrt{K'}} \ge \frac{2}{1-c}$,

where we achieve the last inequality by choosing $L$ large enough ($L := 4(1 + \sqrt{K'})/(1 - c)^2$).

Assuming $w_t \ge \max(1/U, 10)$ we have (60) bounded from below by $\exp(-2/w_t)$. Also note that assuming no "failures" (that is, (61) holds) we have that the $w_t$'s are geometrically decreasing (with a factor at least $(c + 1)/2$). Thus the probability that (61) is never violated (for all $t$ until we stop) is at least

$\prod_{t \ge 1} \exp(-2/w_t) \ge \exp\Bigl(-\sum_{t \ge 1} 2/w_t\Bigr) = \Theta(1)$.

Finally, since the distance to $\mathbf{m}$ is geometrically decreasing (assuming no failures) and it is between $Ln^{-1/2}$ and $\delta$, we have that the number of steps until we stop is $O(\log n)$. □


From Lemmas 21, 22, 26 and 27 we conclude the following. 


Corollary 28. Let $B > \mathfrak{B}_{rc}$ be a constant. The mixing time of the Swendsen-Wang algorithm on the complete graph on $n$ vertices is $O(\log n)$.

Proof. Let $\epsilon > 0$ and consider two copies $X_t, Y_t$ of the SW-chain. We will show that for some $T = O(\log n)$, there exists a coupling such that $\Pr(X_T \ne Y_T) \le \epsilon$.

Let $\delta, L$ be as in Lemma 27. By Lemma 26, for some $T_1 = O(1)$ with probability $\Theta(1)$ we have that

$\|\alpha(X_{T_1}) - \mathbf{m}\|_\infty \le \delta$ and $\|\alpha(Y_{T_1}) - \mathbf{m}\|_\infty \le \delta$. (64)

By Lemma 27, for some $T_2 = O(\log n)$ with probability $\Theta(1)$, we have that

$\|\alpha(X_{T_1+T_2}) - \mathbf{m}\|_\infty \le Ln^{-1/2}$ and $\|\alpha(Y_{T_1+T_2}) - \mathbf{m}\|_\infty \le Ln^{-1/2}$. (65)

Let $T' := T_1 + T_2$. Conditioning on (65), by Lemma 22 there exists a coupling such that with probability $\Theta(1)$, for $T_3 = T' + 1$, it holds that $\alpha(X_{T_3}) = \alpha(Y_{T_3})$. Conditioned on $\alpha(X_{T_3}) = \alpha(Y_{T_3})$, by Lemma 21, for every $\epsilon' > 0$ there exists $T_4 = O(\log n)$ and a second coupling such that $\Pr(X_{T_3+T_4} \ne Y_{T_3+T_4}) \le \epsilon'$. By letting $\epsilon' \downarrow 0$, we obtain a coupling and some $T = O(\log n)$ such that $\Pr(X_T \ne Y_T) \le \epsilon$, as wanted. □

8 Fast mixing for $B = \mathfrak{B}_{rc}$


The proof resembles the case B > *Brc, though we have to account more carefully for the mixing time 
of the chain for configurations which are close to uniform. In particular, for starting configurations 
which are e-far from being uniform, a straightforward modification of the proof for B > 5Sr-c gives 
that the SW-chain mixes rapidly. The main difficulty in the case B = is to show that the chain 
escapes from starting configurations which are close to uniform. We will show that this happens 
after roughly log n steps. More precisely, we have the following lemma. 

Lemma 29. Assume B = ^rc- There exists constants > 0 such that for any n and any initial state 
Xq with probability 0(1) after Ti = O(logn) steps, Xt^ has an s-heavy color and the remaining 
q — 1 colors are s-light. 

Lemma 29 yields the following analogue of Lemma 26 (note here the logarithmic bound on T). 

Lemma 30. Assume B = IB^c- For any 5 > 0 and any starting state Xq, after T = O(logn) steps 
with probability 0(1) the SW-algorithm moves to state Xt such that ||Q:(Xr) — m||oo < d. 

Proof of Lemma 30. From Lemma 29, for some (small) constant $\varepsilon > 0$, we have that for $T_1 = O(\log n)$, with probability $\Theta(1)$, $X_{T_1}$ has an $\varepsilon$-heavy color and the remaining $q - 1$ colors are $\varepsilon$-light. Using Lemma 4 ($F$ is increasing), the second part of Lemma 9 (the uniform fixpoint is Jacobian repulsive) and Lemma 10 (there exists a unique fixpoint of $F$ in the interval $(1/q, 1]$), we obtain that for constant $T_2$ (depending on $\delta$) we have $F^{(T_2)}([(1+\varepsilon)/q, 1]) \subseteq [a - \delta/2, a + \delta/2]$, so the same arguments as in the proof of Lemma 26 yield that $\|\alpha(X_{T_1+T_2}) - m\|_\infty \le \delta$, as wanted. □

Using Lemma 21 (note that it applies to all $B > 0$) and Lemmas 22 and 27 (note that these apply to all $B \ge B_{rc}$), we may conclude the following from Lemma 30.

Corollary 31. Let $B = B_{rc}$ be a constant. The mixing time of the Swendsen-Wang algorithm on the complete graph on $n$ vertices is $O(\log n)$.

Proof. The proof is completely analogous to the proof of Corollary 28; the only difference is that now we use Lemma 30 to argue that (64) holds with probability $\Theta(1)$ for $T_1 = O(\log n)$. □

We next turn to the proof of Lemma 29. We will use the following definition. For $w > 0$, a state $X$ will be called $w$-good if $X$ has a $w$-heavy color and the remaining $q - 1$ colors are $(w/2q)$-light.

Lemma 32. For any starting state $X_0$ and an arbitrary constant $w > 0$, with probability at least $p(w) > 0$ (not depending on $n$) the next state $X_1$ of the SW-dynamics is $wn^{-1/3}$-good.

Lemma 33. There exist absolute constants $c_1, c_2, C > 0$ such that for all $n$ sufficiently large the following holds. For all $w$ such that $c_1 \le w \le c_2 n^{1/3}$, for every $wn^{-1/3}$-good starting state $X_0$, the next state of the SW-dynamics $X_1$ is $(13/12)wn^{-1/3}$-good with probability at least $\exp(-C/w)$.

Before proceeding, let us briefly motivate Lemmas 32 and 33. First, we explain the origin of the constant 13/12 in Lemma 33, whose value is somewhat arbitrary: any constant strictly smaller than 4/3 (and greater than 1) would work for all $q \ge 3$. To understand where the constant 4/3 comes from, recall from Lemma 9 that the uniform phase $u = 1/q$ is a Jacobian repulsive fixpoint of $F$ (for $B = B_{rc}$) and, more precisely, $F'(1/q) = 2(q-1)/q$ (note that $F'(1/q) > 1$ for all $q > 2$). Then, just observe that $\min_{q \ge 3}\{2(q-1)/q\} = 4/3$.

Thus, for any $4/3 > c > 1$ (or, slightly less loosely, when $F'(1/q) > c > 1$), whenever $\|\alpha(X_t) - u\|_\infty$ is sufficiently small, for all sufficiently large $n$, one would expect that

$\|\alpha(X_{t+1}) - u\|_\infty \ge c\,\|\alpha(X_t) - u\|_\infty$.

We show that this indeed holds, as long as $\|\alpha(X_t) - u\|_\infty = \Omega(n^{-1/3})$, by accounting carefully for color classes which are in the critical window for the percolation step of the SW-dynamics. Lemma 33 thus proves that an initial displacement of $\Omega(n^{-1/3})$ (which is guaranteed with constant probability from Lemma 32) increases geometrically.
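The slope $2(q-1)/q$ can be checked numerically. The following is a minimal sketch, assuming the form of $F$ that appears in the proof of Lemma 43, namely $F(z) = 1/q + (1 - 1/q)\,z\,\beta$ where $\beta$ is the positive root of $\beta + \exp(-Bz\beta) = 1$ (and $\beta = 0$ when $Bz \le 1$). The function names and the step size `h` are illustrative choices, not part of the paper.

```python
import math


def giant_fraction(lam, iters=200):
    """Positive root beta of beta + exp(-lam * beta) = 1, or 0.0 if lam <= 1."""
    if lam <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # g(mid) = mid + exp(-lam * mid) - 1 is negative below the root and
        # positive above it; expm1 avoids cancellation for small mid.
        if mid + math.expm1(-lam * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def F(z, q, B):
    """Assumed drift map: density of the largest color class after one SW step."""
    return 1.0 / q + (1.0 - 1.0 / q) * z * giant_fraction(B * z)


def slope_at_uniform(q, h=1e-4):
    """One-sided finite-difference estimate of F'(1/q) at B = B_rc = q."""
    B = float(q)
    return (F(1.0 / q + h, q, B) - F(1.0 / q, q, B)) / h
```

For example, `slope_at_uniform(3)` should be close to $4/3$, and larger values of $q$ give slopes approaching 2, which is consistent with the minimum being attained at $q = 3$.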

Lemma 29 follows immediately from Lemmas 32 and 33. 

Proof of Lemma 29. Let $c_1, c_2, C$ be the constants from Lemma 33. Define $w_t$ by $w_1 = c_1$ and $w_t = (13/12)w_{t-1}$. Moreover, let $0 < \varepsilon_0 < c_2$ and set $t_0 = \lfloor \log(\varepsilon_0 n^{1/3})/\log(13/12) \rfloor$. By Lemmas 32 and 33, for any starting state $X_0$, the state $X_t$ is $w_t n^{-1/3}$-good for all $t = 1, \dots, t_0$ with probability at least $p(w_1)\prod_{t=2}^{t_0}\exp(-C/w_{t-1})$. Note that the product in this expression is bounded below by an absolute positive constant, since the series $\sum_{t \ge 1} 1/w_t$ converges.

It follows that for $\varepsilon < \varepsilon_0/(10q)$, with positive probability (not depending on $n$) $X_{t_0}$ has an $\varepsilon$-heavy color and the remaining $q - 1$ colors are $\varepsilon$-light, as wanted. □
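The key quantitative point in this proof is that a product of factors $\exp(-C/w_t)$ with geometrically growing $w_t$ stays bounded away from zero no matter how many steps are taken. A small numerical sketch follows; the values passed for `c1` and `C` are placeholders, since the paper does not give explicit constants.

```python
import math


def escape_probability_lower_bound(c1, C, steps, ratio=13.0 / 12.0):
    """prod_{t=1}^{steps} exp(-C / w_t) with w_1 = c1 and w_{t+1} = ratio * w_t."""
    log_prob = 0.0
    w = c1
    for _ in range(steps):
        log_prob -= C / w
        w *= ratio
    return math.exp(log_prob)


def geometric_series_floor(c1, C, ratio=13.0 / 12.0):
    """Limit of the product: exp(-C * sum_{t>=1} 1/w_t), where the geometric
    series sums to ratio / (c1 * (ratio - 1))."""
    return math.exp(-C * ratio / (c1 * (ratio - 1.0)))
```

With ratio 13/12 the series sums to $13/c_1$, so the product never drops below $\exp(-13C/c_1)$, independently of the number of steps (and hence of $n$).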


We next prove Lemmas 32 and 33. 


Proof of Lemma 32. We will write $\alpha_i$ as a shorthand for $\alpha_i(X_0)$, and denote $m_i = n\alpha_i$. In each step of the Swendsen-Wang algorithm, the percolation step for color $i$ picks a graph $G_i$ from $G(m_i, q\alpha_i/m_i)$. Let $C_1^{(i)}, C_2^{(i)}, \dots$ be the components of $G_i$ in decreasing order of size.
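For concreteness, one step of the dynamics on the complete graph can be sketched as below. This is an illustrative brute-force implementation (quadratic in $n$), not the authors' code; it uses the fact that $G(m_i, B\alpha_i/m_i)$ with $m_i = n\alpha_i$ keeps each monochromatic edge with probability $B/n$. The function names are hypothetical.

```python
import random


def sw_step(colors, q, B, rng):
    """One Swendsen-Wang step on K_n; colors is a list of ints in range(q)."""
    n = len(colors)
    p = B / n  # edge-retention probability inside each color class
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Percolation step: keep each monochromatic edge independently with prob. p.
    for u in range(n):
        for v in range(u + 1, n):
            if colors[u] == colors[v] and rng.random() < p:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    # Coloring step: every connected component gets a fresh uniform color.
    new_color = {}
    out = []
    for v in range(n):
        r = find(v)
        if r not in new_color:
            new_color[r] = rng.randrange(q)
        out.append(new_color[r])
    return out


def color_counts(colors, q):
    """Number of vertices of each color."""
    counts = [0] * q
    for c in colors:
        counts[c] += 1
    return counts
```

Dividing `color_counts(...)` by $n$ gives the vector $\alpha(X_t)$ tracked throughout the proofs; for instance, `sw_step([0] * 60, 3, 3.0, random.Random(0))` performs one step at $B = q = 3$ from the all-equal configuration.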

The beginning of the proof is analogous to the beginning of the proof of Lemma 41. Let $A, L$ be the constants in Lemma 15 and let $w > L$. For each color $i$ the following hold with positive probability (not depending on $n$):

1. If qai > (1 — Am~^^^)fmi, then (by 

Lemma 15). 

2 . If (1 — Am- ^^^)fmi > qai, then Ylj>i ^ (by Item 1 of Lemma 14). 

Note that for at least one color we have $q\alpha_i \ge 1$ (since the $\alpha_i$'s sum to 1). Let $S = \{i \in [q] : q\alpha_i \ge 1\}$ and consider all the components different from $C_1^{(i)}$, $i \in S$. Color these components independently by a uniformly random color from $[q]$. Let $A_i$ be the number of vertices of color $i$. By Azuma's inequality and a union bound we have that with probability at least $1 - 2q\exp(-50w^2 q)$, for each $i \in [q]$ it holds that

$\left| A_i - \frac{1}{q}\Big(n - \sum_{j \in S} |C_1^{(j)}|\Big) \right| \le (10wq)\,n^{2/3}$.

With probability at least $q^{-q}$ each of $C_1^{(i)}$ with $i \in S$ receives color 1. Let $A'_i$ be the number of vertices of color $i$ after the coloring step of the SW-algorithm. Note, we have $A'_1 = A_1 + \sum_{i \in S} |C_1^{(i)}|$ and $A'_i = A_i$ for $i \ge 2$. We obtain that with probability at least $q^{-q}(1 - 2q\exp(-50w^2 q)) > 0$


, w , n 
i > - + 

q 




{Wwq)n‘^/^ >^ + {mwq^)n^/^, 
and for all $i \in \{2, \dots, q\}$


, w, n 1 

4 <- 

q q 


Y1 + {Wwq)n^/^ <-- {90wq)n'^/^. 

\i&S / ^ 


This concludes the proof. □ 

Proof of Lemma 33. W.l.o.g., we may assume that the color classes $S_1, S_2, \dots, S_q$ of $X_0$ satisfy

$|S_1| \ge \frac{n}{q} + wn^{2/3}$ and $|S_i| \le \frac{n}{q} - \frac{w}{2q}n^{2/3}$ for $i \in \{2, \dots, q\}$. (66)

Now we make a step of the Swendsen-Wang algorithm. Let $C_1, C_2, \dots$ be all the connected components after the percolation step of the Swendsen-Wang algorithm, listed in decreasing size. By Lemma 14 (first part for the color classes $i = 2, \dots, q$ and second part for the color class $i = 1$) we have

$E\Big[\sum_{i \ge 2} |C_i|^2\Big] \le \frac{2K n^{4/3}}{w}$.

By Markov's inequality,

$P\Big(\sum_{i \ge 2} |C_i|^2 \ge n^{4/3}\Big) \le \frac{2K}{w}$. (67)

By Lemma 14 (part 3) we have

$P\big(|C_1| \le (7/4)wn^{2/3}\big) \le K\exp(-cq^2w^3)$. (68)


For all sufficiently large $w$, we may assume that the bad events in (67) and (68) did not occur, that is, $|C_1| \ge (7/4)wn^{2/3}$ and $\sum_{i \ge 2} |C_i|^2 \le n^{4/3}$. Now we color the components $C_2, C_3, \dots$ independently by a uniformly random color from $[q]$ (for now we leave the component $C_1$ uncolored). Let $A_i$ be the number of vertices of color $i$. We have by Azuma's inequality that

$P\left( \left| A_i - \frac{n - |C_1|}{q} \right| \ge \frac{wn^{2/3}}{4q} \right) \le 2\exp(-w^2/(32q^2))$. (69)


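Now we color $C_1$; w.l.o.g., it receives color 1. Let $A'_i$ be the number of vertices of color $i$ now (we have $A'_1 = A_1 + |C_1|$ and $A'_i = A_i$ for $i \ge 2$). Applying a union bound to (69) we obtain that with probability at least $1 - 2q\exp(-w^2/(32q^2))$ we have

$|A'_1| \ge \frac{n}{q} + wn^{2/3}\Big(\frac{7}{4}\big(1 - \frac{1}{q}\big) - \frac{1}{4q}\Big) \ge \frac{n}{q} + \frac{13}{12}wn^{2/3}$, (70)

and for all $i \in \{2, \dots, q\}$

$|A'_i| \le \frac{n}{q} - wn^{2/3}\Big(\frac{7}{4}\cdot\frac{1}{q} - \frac{1}{4q}\Big) \le \frac{n}{q} - \frac{13}{12}\cdot\frac{w}{2q}n^{2/3}$. (71)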


(Note that we used the fact that $q \ge 3$ in the second inequality in (70).)

Let $w' = (13/12)w$. Summarizing all the steps we obtain that from a state satisfying (66) we get to a state satisfying

$|S'_1| \ge \frac{n}{q} + w'n^{2/3}$ and $|S'_i| \le \frac{n}{q} - \frac{w'}{2q}n^{2/3}$ for $i \in \{2, \dots, q\}$, (72)

with probability at least

$\Big(1 - \frac{2K}{w} - K\exp(-cq^2w^3)\Big)\Big(1 - 2q\exp(-w^2/(32q^2))\Big)$. (73)

For all sufficiently large $w$, the last expression is greater than $\exp(-C/w)$, where $C$ is a positive constant (depending on $K, c, q$), as wanted. □

9 Lower bound on the mixing time for $B \ge B_{rc}$

Recall from Section 5 that $\mathcal{B}(v, \delta)$ is the $\ell_\infty$-ball of configuration vectors of the $q$-state Potts model in $K_n$ around $v$ of radius $\delta$, cf. equation (34). Let

$S := \mathcal{B}(m, n^{-1/4})$.

We first establish the following (crude) bound on the probability mass of configurations in $S$ in the Potts distribution.

Lemma 34. Let $B \ge B_{rc}$. For the Potts distribution $\mu$ in (1), for all sufficiently large $n$, it holds that $\mu(\bar{S}) \le 1/8$, where $\bar{S}$ denotes the complement of $S$.

Proof. For any starting state $X_0$, we have that for $T = O(\log n)$, it holds that

$\Pr(X_T \in S) \ge \varepsilon$,

where $\varepsilon > 0$ is a constant independent of $n$. (For $B > B_{rc}$ this follows by Lemmas 26 and 27, and for $B = B_{rc}$ this follows by Lemmas 30 and 27.) It follows that for all non-negative integers $j$ it holds that

$\Pr(X_{(j+1)T} \in S \mid X_{jT} \notin S) \ge \varepsilon$.

Further, by the second part of Lemma 19, for integer $t \ge 0$, it holds that

$\Pr(X_{t+1} \in S \mid X_t \in S) \ge 1 - \exp(-\Omega(n^{1/3}))$.

We thus obtain that for some positive integer $j = j(\varepsilon)$, for all sufficiently large $n$, for all integer $t \ge jT$, it holds that

$\Pr(X_t \in S) \ge 15/16$. (74)

Let $T^* = \max\{jT, 4T_{mix}\}$. Recall that $T_{mix} = O(\log n)$ (cf. Corollaries 28 and 31), so $T^* = O(\log n)$ as well. Since $T_{mix}$ is the time needed to get within total variation distance $\le 1/4$ from $\mu$, we have that for any $\varepsilon' > 0$, for $t \ge \lceil \log_2(1/\varepsilon') \rceil T_{mix}$, it holds that $d_{TV}(X_t, \mu) \le \varepsilon'$ (see [16, Section 4.5]). Thus, we have that

$\mu(\bar{S}) - \Pr(X_{T^*} \in \bar{S}) \le \max_{A \subseteq \Omega} |\mu(A) - \Pr(X_{T^*} \in A)| = d_{TV}(X_{T^*}, \mu) \le 1/16$. (75)

Combining (74) and (75) yields $\mu(\bar{S}) \le 1/8$, as wanted. □

Lemma 35. For $B \ge B_{rc}$, there exist constants $\delta_1, \delta_2 > 0$ such that the following is true. Suppose that we start at a state $X_0$ such that $X_0 \notin S$ and $\delta_2 \le \|\alpha(X_0) - m\|_\infty \le \delta_1$. Then for some $T = \Omega(\log n)$, with probability $\ge 1/2$, it holds that $X_T \notin S$.
Proof of Lemma 35. Recall that $m = (a, b, \dots, b)$ where $a > 1/q$ is a fixpoint of $F$. Let $\delta > 0$ be such that for some $0 < c_l < c_u < 1$, for all $z \in [a - \delta, a + \delta]$ we have

$c_l|z - a| \le |F(z) - a| \le c_u|z - a|$. (76)

Note that the existence of such $\delta$ is guaranteed throughout the regime $B \ge B_{rc}$ since $|F'(a)| < 1$ by Lemma 3, $F'(a) > 0$ by Lemma 4 and $F'$ is continuous. Let $\delta_1, \delta_2$ be arbitrary constants satisfying $0 < \delta_2 < \delta_1 < \delta$.

Suppose that we are at $X_t$ such that $n^{-1/4} < \|\alpha(X_t) - m\|_\infty < \delta$ (note that for such $X_t$, we have $X_t \notin S$). Let $m_1$ be the number of vertices in the largest color class and note that $m_1/n = a + \tau =: a'$ where $|\tau| < \delta$. Exactly as in the proof of Lemma 19 (cf. equation (39)), we obtain that with probability $\ge 1 - 2q\exp(-\Omega(n^{1/3}))$ it holds that

$\left\| \alpha(X_{t+1}) - \Big(F(a'), \frac{1 - F(a')}{q - 1}, \dots, \frac{1 - F(a')}{q - 1}\Big) \right\|_\infty \le n^{-1/3}$. (77)

Using (76), we have

$c_l\|\alpha(X_t) - m\|_\infty \le \left\| \Big(F(a'), \frac{1 - F(a')}{q - 1}, \dots, \frac{1 - F(a')}{q - 1}\Big) - m \right\|_\infty \le c_u\|\alpha(X_t) - m\|_\infty$. (78)

Equations (77) and (78) combined yield that for all sufficiently large $n$ we have the following two bounds:

$\|\alpha(X_{t+1}) - m\|_\infty \ge c_l\|\alpha(X_t) - m\|_\infty - n^{-1/3} \ge \frac{c_l}{2}\|\alpha(X_t) - m\|_\infty$, (79)

$\|\alpha(X_{t+1}) - m\|_\infty \le n^{-1/3} + c_u\|\alpha(X_t) - m\|_\infty < \delta$. (80)

Let $c' = -\frac{1}{8}/\log(c_l/2)$. Applying (79) for $t = 0, \dots, \lfloor c'\log n \rfloor$ (note that (80) guarantees that we remain sufficiently close to $m$ so that (79) indeed applies), we obtain that with probability $1 - o(1)$ it holds that

$\|\alpha(X_{\lfloor c'\log n \rfloor}) - m\|_\infty \ge n^{-1/8}\|\alpha(X_0) - m\|_\infty \ge \delta_2 n^{-1/8} > n^{-1/4}$.

This completes the proof. □

Using Lemma 35, we obtain the following corollary. 

Corollary 36. Let $B \ge B_{rc}$. Then $T_{mix} = \Omega(\log n)$.

Proof. Let $\delta_1, \delta_2$ be as in Lemma 35. Consider $X_0$ such that $X_0 \notin S$ and $\delta_2 \le \|\alpha(X_0) - m\|_\infty \le \delta_1$. Then, by Lemma 35, for some $T = \Omega(\log n)$ we have that

$\Pr(X_T \notin S) \ge 1/2$.

On the other hand, by Lemma 34 we have that $\mu(\bar{S}) \le 1/8$. It follows that

$d_{TV}(X_T, \mu) = \max_{A \subseteq \Omega} |\mu(A) - \Pr(X_T \in A)| \ge \Pr(X_T \in \bar{S}) - \mu(\bar{S}) \ge 1/2 - 1/8 > 1/4$.

It follows from the definition of mixing time that $T_{mix} > T$, as claimed. □
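The last step uses only the fact that any single event witnesses a lower bound on total variation distance. A toy illustration on a three-point state space follows; the distributions are made up for the example and have nothing to do with the Potts measure.

```python
def total_variation(p, mu):
    """Total variation distance between two distributions given as dicts
    mapping states to probabilities (missing states have probability 0)."""
    states = set(p) | set(mu)
    return 0.5 * sum(abs(p.get(s, 0.0) - mu.get(s, 0.0)) for s in states)


def event_gap(p, mu, event):
    """p(event) - mu(event); this never exceeds total_variation(p, mu)."""
    return sum(p.get(s, 0.0) for s in event) - sum(mu.get(s, 0.0) for s in event)
```

Taking `event` to play the role of $\bar{S}$, a chain that puts mass at least 1/2 on an event of stationary mass at most 1/8 is at distance at least 3/8 from stationarity, which is more than the 1/4 threshold in the definition of the mixing time.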
10 Fast mixing for $B < B_u$


The proof for establishing mixing in the uniqueness regime will be similar to the $B > B_{rc}$ case. We begin with the following analogue of Lemma 23.

Lemma 37. Assume $B < B_{rc}$ is a constant. There exists $\varepsilon > 0$ such that for any $n$ and any initial state $X_0$, with probability $\Theta(1)$ the next state $X_1$ has at least $q - 1$ colors that are $\varepsilon$-light.

Proof. Let $\varepsilon < 1/10$ be small enough such that $B(1 + 2\varepsilon) < q$. As in the proof of Lemma 23, with probability $\Theta(1)$ all the giant components receive color 1 (it could be that there are no giant components). By a Chernoff bound, with probability $1 - o(1)$ colors $2, \dots, q$ receive less than a $(1 + \varepsilon/2)/q$ fraction of vertices, and since $(1 + \varepsilon/2)/(1 + 2\varepsilon) < 1 - \varepsilon$ we have that they are $\varepsilon$-light. □


We then have the following lemma, which is an analogue of Lemmas 26 and 27 in the $B > B_{rc}$ case, showing that we get within $O(n^{-1/2})$ of the uniform phase.

Lemma 38. Assume $B < B_u$ is a constant. There exists a constant $L$ such that for any starting state $X_0$, after $T = O(1)$ steps, with probability $\Theta(1)$ the SW-algorithm moves to a state $X_T$ such that $\|\alpha(X_T) - u\|_\infty \le Ln^{-1/2}$.


Proof of Lemma 38. Let $\varepsilon$ be as in Lemma 37. By Lemma 37, starting from any $X_0$, with constant probability we move to $X_1$ where $q - 1$ colors are $\varepsilon$-light. W.l.o.g. let 1 be the remaining color. Let $z$ be such that $zn$ vertices have color 1. In the next step the only giant component (with probability $1 - o(1)$) can arise from color 1. With probability $1 - o(1)$ we move to a state $(F(z)n(1 + o(1)), (1 - F(z))n(1 + o(1))/(q-1), \dots, (1 - F(z))n(1 + o(1))/(q-1))$. Since $1/q$ is the only fixpoint of $F$ we have that there exists a constant $T$ such that $F^{(T)}([0,1]) \subseteq [1/q - \varepsilon, 1/q + \varepsilon]$. With probability $1 - o(1)$, after at most $T$ steps the size of the largest color class becomes less than $(1/q + (3/2)\varepsilon)n$. In the next step even the largest color class is subcritical and we end up (with probability $1 - o(1)$) in a state where each color occurs $(1 + o(1))n/q$ times. In the next step the component sizes after the percolation step satisfy, by Lemma 14,

$E\Big[\sum_i |C_i|^2\Big] = O(n)$.

Hence, after the coloring step, with constant probability (using the same argument as in (57) and (58)) we have color classes of size $(n + O(n^{1/2}))/q$. □


To show that the mixing time of SW is $O(1)$ when $B < B_u$, we extend the strategy of [17] for $q = 2$ to $q \ge 3$. In [17], a certain projection of the SW chain is defined, called the magnetization chain. For us, the magnetization chain can be defined as follows. Let $\{V_1, \dots, V_q\}$ be a fixed partition of the vertex set of the complete graph into $q$ parts. The magnetization chain is a Markov chain $A_t = (A_{ij,t})_{i,j \in [q]}$ with $A_{ij,t}$ being the number of vertices in $V_i$ with color $j$ at time $t$ (the fact that the magnetization chain is a Markov chain is due to the symmetry). Note that for every $t = 0, 1, \dots$, for every $i \in [q]$ it holds that $\sum_{j \in [q]} A_{ij,t} = |V_i|$.
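To make the projection concrete, the following sketch computes the magnetization matrix of a configuration with respect to a fixed partition, together with the per-color totals used in Lemma 39. Colors and parts are indexed from 0 here, and the helper names are illustrative.

```python
def magnetization(parts, colors, q):
    """Return the q x q matrix A with A[i][j] = #{v in parts[i] : colors[v] == j}."""
    A = [[0] * q for _ in range(q)]
    for i, part in enumerate(parts):
        for v in part:
            A[i][colors[v]] += 1
    return A


def color_totals(A):
    """Column sums a_j = sum_i A[i][j]: total number of vertices with color j."""
    q = len(A)
    return [sum(A[i][j] for i in range(q)) for j in range(q)]
```

Each row of the matrix sums to the size of the corresponding part, which is the invariant noted above, while the column sums change from step to step.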

The following lemma is the analogue of [17, Proposition 7.3] and can be proved analogously to 
Lemma 22. 


Lemma 39. Let $\{V_1, \dots, V_q\}$ be a partition of the vertex set of the complete graph on $n$ vertices into $q$ parts. Let $A_t$ and $A'_t$ be two copies of the magnetization chain. Further, denote by $a_{j,t}, a'_{j,t}$ the total number of vertices with color $j$ in $A_t$ and $A'_t$, respectively, i.e.,

$a_{j,t} = \sum_{i \in [q]} A_{ij,t}$, $a'_{j,t} = \sum_{i \in [q]} A'_{ij,t}$.

Let $L > 0$ be an arbitrarily large constant and suppose that at time $t$ it holds that

$|a_{j,t} - n/q| \le L\sqrt{n}$, $|a'_{j,t} - n/q| \le L\sqrt{n}$ for all $j \in [q]$.

Then, there exists a coupling of $A_{t+1}, A'_{t+1}$ such that with probability $\Theta(1)$, it holds that $A_{t+1} = A'_{t+1}$.

Proof. The proof is completely analogous to [17, Proof of Proposition 7.3] and the proof of Lemma 22 given earlier; namely, first perform the percolation step in each chain independently. Then, there is a constant $c > 0$ such that with probability $\Theta(1)$ in each part $V_i$ there are at least $c|V_i|$ isolated vertices in each chain which can be used to equalize the counts $A_{ij,t+1}, A'_{ij,t+1}$. □

Using Lemmas 38 and 39, we conclude the following corollary. 

Corollary 40. Let $B < B_u$ be a constant. The mixing time of the Swendsen-Wang algorithm on the complete graph on $n$ vertices is $O(1)$.

Proof. Let $\pi$ be the stationary distribution of the Swendsen-Wang algorithm. Consider two copies of the SW-algorithm $X_t$ and $Y_t$, where $X_0$ is an arbitrary starting configuration and $Y_0$ is distributed according to $\pi$. It suffices to show that there is $T = O(1)$ and a coupling of $X_T, Y_T$ such that $X_T = Y_T$ with probability $\Omega(1)$.

We will use the magnetization chain for an appropriate partition $\{V_1, \dots, V_q\}$ of the vertices of the complete graph. Namely, for a color $i \in [q]$, let $V_i$ be the set of vertices with color $i$ in $X_0$. Let $A_t = \{A_{ij,t}\}_{i,j \in [q]}$, $A'_t := \{A'_{ij,t}\}_{i,j \in [q]}$ be such that $A_{ij,t}, A'_{ij,t}$ is the number of vertices with color $j$ in $V_i$ in $X_t$ and $Y_t$, respectively. The key idea is that, due to symmetry, the probability that the SW-chain at time $t$ is at a particular configuration $\sigma$ depends only on the counts $|V_i \cap \sigma^{-1}(j)|$ for $i \in [q]$ and $j \in [q]$. It follows that for every $t$, it holds that

$d_{TV}(X_t, Y_t) = d_{TV}(A_t, A'_t)$. (81)

It thus suffices to show that for $T = O(1)$, there is a coupling of $A_T$ and $A'_T$ such that $A_T = A'_T$ with probability $\Theta(1)$.

Let $L$ be the constant in Lemma 38. By Lemma 38, we have that for $T_1 = O(1)$, with probability $\Theta(1)$ it holds that

$\|\alpha(X_{T_1}) - u\|_\infty \le Ln^{-1/2}$, $\|\alpha(Y_{T_1}) - u\|_\infty \le Ln^{-1/2}$. (82)

Conditioned on (82), Lemma 39 shows that there exists a coupling of $A_{T_1+1}$ and $A'_{T_1+1}$ such that with probability $\Theta(1)$ it holds that $A_{T_1+1} = A'_{T_1+1}$. Using (81), we thus conclude that the mixing time of the Swendsen-Wang algorithm is $O(1)$, as wanted. □

11 Mixing Time at $B = B_u$

We will track the size of the largest color class. Roughly, our goal is to show that the chain reaches the uniform phase in $O(n^{1/3})$ steps.

11.1 Tracking one iteration of the SW-dynamics 

As a starting point, we have the following analogue of Lemma 23. 
Lemma 41. For sufficiently small (constant) $\varepsilon > 0$, for any starting state $X_0$ of the SW-chain, with probability $\Theta(1)$, there are at least $q - 1$ colors in state $X_1$ which are $\varepsilon$-light. Further, if state $X_0$ has $q - 1$ $\varepsilon$-light colors, then with probability $1 - \exp(-\Theta(n^{1/3}))$ the same is true for $X_1$.


Proof. We will write $\alpha_i$ as a shorthand for $\alpha_i(X_t)$, and denote $m_i = n\alpha_i$. In each step of the Swendsen-Wang algorithm, the percolation step for color $i$ picks a graph $G_i$ from $G(m_i, B\alpha_i/m_i)$. Let $C_1^{(i)}, C_2^{(i)}, \dots$ be the components of $G_i$ in decreasing order of size.

Let $A$ be the constant in Lemma 15. For each color $i$ the following hold with positive probability (not depending on $n$):

1. If Bai > (1 — Am- then Ylj>i ^ (by Lemma 15). 

2 . If (1 — Amf^^^)fmi > Bai, then Ylj>i (by Item 1 of Lemma 14). 

Let $S = \{i \in [q] : B\alpha_i \ge 1\}$ (note that the set $S$ may be empty). Consider all the components different from $C_1^{(i)}$, $i \in S$. Color these components independently by a uniformly random color from $[q]$. Let $A'_i$ be the number of vertices of color $i$. Let $w > 0$ be a constant such that $1 > 2q\exp(-w^2/2)$. By Azuma's inequality and a union bound we have that with probability at least $1 - 2q\exp(-w^2/2) > 0$, for each $i \in [q]$ it holds that

$\left| A'_i - \frac{1}{q}\Big(n - \sum_{j \in S} |C_1^{(j)}|\Big) \right| \le wn^{2/3}$.

With probability at least $q^{-q}$ each of $C_1^{(i)}$ with $i \in S$ receives color 1. Let $A_i$ be the number of vertices of color $i$ after the coloring step of the SW-algorithm. Note, we have $A_1 = A'_1 + \sum_{i \in S} |C_1^{(i)}|$ and $A_i = A'_i$ for $i \ge 2$. We obtain that with probability at least $q^{-q}(1 - 2q\exp(-w^2/2)) > 0$, for all $i \in [q]$

$|A'_i| \le \frac{n}{q} - \frac{1}{q}\Big(\sum_{j \in S} |C_1^{(j)}|\Big) + wn^{2/3} \le \frac{n}{q} + wn^{2/3}$. (83)

Since $B < q$, we have that for sufficiently small constant $\varepsilon > 0$, for all $n$ sufficiently large, it holds that $|A_i| \le (1 - \varepsilon)n/B$ for all $i \neq 1$, and thus the colors $2, \dots, q$ are $\varepsilon$-light with probability $\Theta(1)$, as wanted.

For the second part of the lemma, where we know that in $X_0$ there are $q - 1$ colors which are $\varepsilon$-light, the proof is analogous. The difference is that now we need upper bounds for the sum of squares of the components (other than the largest component; there can be at most one of those by the assumption) which hold with probability $1 - \exp(-\Theta(n^{1/3}))$. Note, for a color class $i$, we have the (crude) bounds

$\sum_{j \ge 1} |C_j^{(i)}|^2 \le n|C_1^{(i)}|$ and $\sum_{j \ge 2} |C_j^{(i)}|^2 \le n|C_2^{(i)}|$. (84)

For each of the (<? — !) e-light colors, the first inequality in (84) together with Lemma 13 bounds the 
sum of squares of the components by with probability 1 — exp(—0(n^/^)). For the remaining 
color class (i.e., the one that we do not have an upper bound on its density by the assumption), to 
bound the sum of squares of the components we obtain the same bound by considering cases. 
If the color class is supercritical we use Lemma 11 and the second inequality in (84). If the color 
class is in the critical window we use Lemma 17 and the first inequality in (84). If the color class 
is subcritical we use Lemma 13 and the first inequality in (84). The only modihcation needed in 
the argument is to replace wr?!"^ in (83) by and the remaining part holds verbatim. □ 
Let $S_t$ be the size of the largest color class in state $X_t$ of the SW-chain. The key part of our arguments is to track the evolution of $S_t$ when there are $(q - 1)$ $\varepsilon$-light colors.

We first do this in the easier case when $S_t$ has density close to $1/B$ (in the complementary regime, we will need more statistics of $S_t$). In this regime, the following lemma roughly says that a step of the SW-dynamics makes the density of the largest color class roughly $1/q$. (Intuitively, this follows by a "continuity" argument since $F(1/B) = 1/q$.)

Lemma 42. Let $\varepsilon > 0$ be a sufficiently small constant. Suppose that $X_t$ is such that $q - 1$ colors are $\varepsilon$-light and that $S_t \le (1 + \varepsilon)n/B$. Then with probability $1 - \exp(-\Theta(n^{1/3}))$ it holds that $S_{t+1} \le (1 + 3q\varepsilon)n/q$.

Proof of Lemma 42. The proof is analogous to the proof of Lemma 41 and as such we follow the notation in there. The only difference is that now we have to account slightly more accurately for the size of the largest color class in $X_{t+1}$.

Assume that the $q - 1$ $\varepsilon$-light colors in $X_t$ are $2, \dots, q$ and assume w.l.o.g. that (the perhaps linear sized) $C_1^{(1)}$ gets colored with color 1 (in state $X_{t+1}$). The color classes of $2, \dots, q$ in $X_t$ are subcritical and thus fall into Item 2 of the analysis in the proof of Lemma 41. For the remaining color class 1 in $X_t$, it may fall either into Item 1 or 2.

It follows that the bounds for $A'_i$ in (83) still hold and in particular the colors $2, \dots, q$ have size at most $(1/q)n + o(n)$ (since they did not receive a giant component).

For the color class 1 in $X_{t+1}$, note that $A_1 = |C_1^{(1)}| + A'_1$. For all sufficiently small (constant) $\varepsilon > 0$, the largest component $C_1^{(1)}$ with probability $1 - \exp(-\Theta(n^{1/3}))$ has size at most $3\varepsilon(n/B)$ (by Item 3 of Lemma 14). Note that $B_u > 1$ for all $q \ge 3$ (follows, e.g., by definition (3)) and hence $3\varepsilon(n/B) \le 3\varepsilon n$. It follows that for all sufficiently large $n$, $A_1$ is at most $(1 + 3q\varepsilon)n/q$, as wanted. □


The following lemma gives some statistics of $S_t/n$ throughout the range $(1/B, 1]$, i.e., when the largest color class is supercritical in the percolation step of the SW-dynamics. Recall the function $F$ defined in (6),(7).

Lemma 43. Let $\varepsilon > 0$ be an arbitrarily small constant and condition on the event that $X_t$ has $q - 1$ colors which are $\varepsilon$-light. Assume that $\zeta$ satisfies $(1 + \varepsilon)/B \le \zeta/n \le 1$. Let $Z = E[S_{t+1} \mid S_t = \zeta]$.

For all constant $\varepsilon' > 0$, for all sufficiently large $n$, it holds that

$nF(\zeta/n) - n^{\varepsilon'} \le Z \le nF(\zeta/n) + n^{\varepsilon'}$. (85)

Also, there exist absolute constants $Q_1, Q_2$ (depending only on $\varepsilon$) such that

$nQ_1 \le Var[S_{t+1} \mid S_t = \zeta] \le nQ_2$. (86)

Finally, for every integer $k \ge 3$ and constant $\varepsilon' > 0$, there exists a constant $c > 0$ such that

$E\big[|S_{t+1} - Z|^k \mid S_t = \zeta\big] \le c\,n^{k/2 + \varepsilon'}$. (87)


Proof. To avoid overloading notation, we assume throughout that we condition on $S_t = \zeta$.

We will write $\alpha_i$ as a shorthand for $\alpha_i(X_t)$, and denote $m_i = n\alpha_i$. W.l.o.g. we will assume that the color class with largest size is the one corresponding to color 1, so that $\alpha_1 = \zeta/n \ge (1 + \varepsilon)/B$. Since the remaining $(q - 1)$ colors are $\varepsilon$-light, for each $i \in \{2, \dots, q\}$ we have $\alpha_i \le (1 - \varepsilon)/B$.

In each step of the Swendsen-Wang algorithm, the percolation step for color $i$ picks a graph $G_i$ from $G(m_i, B\alpha_i/m_i)$. Let $C_1^{(i)}, C_2^{(i)}, \dots$ be the components of $G_i$ in decreasing order of size. Note that $G_1$ is in the supercritical regime, while $G_2, \dots, G_q$ are in the subcritical regime. By Lemma 12, for every constant $\varepsilon' > 0$ we have that

$E[|C_1^{(1)}|] = \beta\zeta \pm \zeta^{\varepsilon'} = \beta\zeta \pm n^{\varepsilon'}$, (88)

where $\beta \in (0,1)$ satisfies $\beta + \exp(-\beta B\zeta/n) = 1$. Note that

$nF(\zeta/n) = \frac{n}{q} + \Big(1 - \frac{1}{q}\Big)\beta\zeta$.

Let $A_i$ be the number of vertices with color $i$ in $X_{t+1}$ and w.l.o.g. assume that $C_1^{(1)}$ receives the color 1 in the coloring step of the SW-dynamics. We will show that with probability $1 - \exp(-\Theta(n^{\varepsilon'}))$ it holds that $S_{t+1} = A_1$, so the estimates on the moments of $S_{t+1}$ will follow from those of $A_1$.

More precisely, with a view to also proving (87), we will show that for every sufficiently small constant $\varepsilon' > 0$ it holds with probability $1 - \exp(-\Theta(n^{\varepsilon'}))$ that

$|A_1 - nF(\zeta/n)| \le 2n^{1/2 + \varepsilon'}$ and $A_i \le (n - \beta\zeta)/q + 2n^{1/2 + \varepsilon'}$ for $i \in \{2, \dots, q\}$. (89)

Since $\beta\zeta = \Omega(n)$, we will then obtain that $A_1 > A_i$ for all $i \neq 1$.

From Lemma 11 equation (28) (applied to color $i = 1$) and Lemma 13 (applied to colors $i = 2, \dots, q$), with probability $1 - q\exp(-\Theta(n^{\varepsilon'}))$, we have

$|C_j^{(1)}| \le n^{\varepsilon'}$ for $j \ge 2$, $|C_j^{(i)}| \le n^{\varepsilon'}$ for $i \in \{2, \dots, q\}$, $j \ge 1$. (90)

From Lemma 11 equation (29), with probability $1 - \exp(-\Theta(n^{\varepsilon'}))$, we also have

$\big| |C_1^{(1)}| - \beta\zeta \big| \le n^{1/2 + \varepsilon'}$. (91)

(Note that $\beta\zeta = \Omega(n)$.) Condition on the event that the bounds in (90) and (91) hold. From (90), we have the crude bound

$\sum_{j \ge 2} (|C_j^{(1)}|)^2 + \sum_{q \ge i \ge 2} \sum_{j \ge 1} (|C_j^{(i)}|)^2 \le n^{1 + \varepsilon'}$. (92)

Consider now the coloring step of the SW-process. Consider all the components different from $C_1^{(1)}$. Color these components independently by a uniformly random color from $[q]$. Let $A'_i$ be the number of vertices of color $i$ in this process. Note that $A_1 = A'_1 + |C_1^{(1)}|$ and $A_i = A'_i$ for $i = 2, \dots, q$. Using (92), by Azuma's inequality we have that with probability $1 - 2q\exp(-\Theta(n^{\varepsilon'}))$ for all $i \in [q]$ it holds that

$\left| A'_i - \frac{n - |C_1^{(1)}|}{q} \right| \le n^{1/2 + \varepsilon'}$. (93)

From (91) and (93) we obtain that for all sufficiently large $n$, with probability $1 - \exp(-\Theta(n^{\varepsilon'}))$ it holds that $S_{t+1} = A_1$. It follows that $E[S_{t+1}] = E[A_1] + o(1)$ and $E[|S_{t+1} - E[S_{t+1}]|^k] = E[|A_1 - E[A_1]|^k] + o(1)$ for all integer $k \ge 2$. Thus, the bounds in (85), (86), (87) will follow from

$E[A_1] = nF(\zeta/n) \pm n^{\varepsilon'}$, (94)

$Q_1 n \le Var[A_1] \le Q_2 n$, (95)

$E[|A_1 - E[A_1]|^k] \le K n^{k/2 + \varepsilon'}$, (96)

where $k \ge 3$ is an integer, $\varepsilon' > 0$ is an arbitrarily small constant, $Q_1, Q_2 > 0$ are absolute constants and $K$ is a constant depending on $k$.

Assuming (94) for the moment, note that (96) follows by integrating the first inequality in (89). We thus focus on proving (94) and (95), where we need more precise bounds. By the second inequality in (33) of Lemma 12 (applied to color 1) and part 1 of Lemma 14 (applied to colors $i = 2, \dots, q$), we have for some constants $K_1, K_2, K_3 > 0$ that

$K_1 n \le Var[|C_1^{(1)}|] \le K_2 n$, $E\Big[\sum_{j \ge 2} (|C_j^{(1)}|)^2 + \sum_{q \ge i \ge 2} \sum_{j \ge 1} (|C_j^{(i)}|)^2\Big] \le K_3 n$. (97)

Denote by $C$ the random vector $\{|C_j^{(i)}|\}_{i \in [q], j \ge 1}$. We first estimate the moments of $A_1$ conditioned on $C$. We have

$E[A_1 \mid C] = |C_1^{(1)}| + \frac{n - |C_1^{(1)}|}{q} = \frac{n}{q} + \Big(1 - \frac{1}{q}\Big)|C_1^{(1)}|$, (98)

$Var[A_1 \mid C] = \frac{1}{q}\Big(1 - \frac{1}{q}\Big)\Big[\sum_{j \ge 2} (|C_j^{(1)}|)^2 + \sum_{q \ge i \ge 2} \sum_{j \ge 1} (|C_j^{(i)}|)^2\Big]$. (99)

It follows from (98) that $E[A_1] = \frac{n}{q} + (1 - \frac{1}{q})E[|C_1^{(1)}|]$, so (94) follows from (88). Also, by the law of total variance we have $Var[A_1] = Var[E[A_1 \mid C]] + E[Var[A_1 \mid C]]$, so from (88),(97),(98), we obtain (95).

This concludes the proof of Lemma 43. □
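The conditional moments of $A_1$ given the component sizes are those of a sum of independent terms: the largest component contributes its full size, and every other component of size $s$ contributes $s$ with probability $1/q$. This gives mean $n/q + (1 - 1/q)|C_1^{(1)}|$ and variance $(1/q)(1 - 1/q)\sum s^2$. The sketch below checks this by exact enumeration on a small, made-up list of component sizes; index 0 in the code plays the role of color 1, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def exact_moments_of_A1(sizes, q):
    """sizes[0] is the component that is given color 1; every other component
    receives an independent uniform color among q colors. Returns the exact
    (mean, variance) of A_1, the number of vertices of color 1."""
    giant, rest = sizes[0], sizes[1:]
    weight = Fraction(1, q ** len(rest))
    mean = Fraction(0)
    second = Fraction(0)
    for assignment in product(range(q), repeat=len(rest)):
        a1 = giant + sum(s for s, c in zip(rest, assignment) if c == 0)
        mean += weight * a1
        second += weight * a1 * a1
    return mean, second - mean * mean


def formula_moments_of_A1(sizes, q):
    """Closed-form conditional mean and variance of A_1 given the sizes."""
    n = sum(sizes)
    giant, rest = sizes[0], sizes[1:]
    mean = Fraction(n, q) + (1 - Fraction(1, q)) * giant
    var = Fraction(1, q) * (1 - Fraction(1, q)) * sum(s * s for s in rest)
    return mean, var
```

Because the arithmetic is done with `Fraction`, the two functions agree exactly rather than up to floating-point error.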


11.2 Upper bound on the mixing time at $B = B_u$

In this section, we prove that the mixing time of the SW-algorithm satisfies $T_{mix} = O(n^{1/3})$ at $B = B_u$.

The most difficult part of our arguments is to argue that the SW-chain escapes the vicinity of the majority phase, i.e., when the largest color class $S_t$ is roughly $na$ (recall that $a$ is the marginal of the majority phase and satisfies $F(a) = a$). In particular, note that when $S_t/n = a$, from (85) the expected value of $S_{t+1}/n$ is $a$ as well. More generally, the drift of the process in the window $|S_t - na| \le \varepsilon n^{2/3}$ for some small $\varepsilon > 0$ is very weak. An expansion of $F$ around the point $a$ yields that in this region $nF(S_t/n) \approx S_t - c(S_t - an)^2/n$ for some constant $c > 0$, so the change (in expectation) of $S_{t+1}$ relative to $S_t$ is roughly $\varepsilon^2 n^{1/3}$. In particular, how does the process escape this window?

The rough intuition is that inside the window the variance of the process aggregates the right way, that is, after $n^{1/3}$ steps, the process is displaced by the square root of the "aggregate variance", i.e., roughly $\sqrt{n^{1/3} \cdot n} = n^{2/3}$. In the meantime, it holds that $F(z) \le z$, so $S_t$ is bound to escape the window from its lower end. From that point on, the drift coming from the expectation of $S_t$ (or else the function $F$) is sufficiently strong to take over and drive the process to the uniform phase.

The easiest way to capture the progress of the chain towards the uniform phase is by a potential function argument. Namely, we show the following lemma in Section 11.4.

Lemma 44. Let $B = B_u$. There exist constants $M_1, M_2, \tau > 0$ such that for all sufficiently small $\varepsilon > 0$, for all sufficiently large $n$ the following holds. There exists an increasing three-times differentiable potential function $G : [1/q, 1] \to [0, M_1 n^{1/3}]$ with $G(1/q) = 0$ and $\max_{z \in [1/q, 1/B]} G'(z) \le M_2$ such that for any $\zeta \ge (1 + \varepsilon)n/B$, if $X_t$ has $(q - 1)$ colors which are $\varepsilon$-light it holds that

$E[G(S_{t+1}/n) \mid S_t = \zeta] \le G(\zeta/n) - \tau$. (100)
The proof of Lemma 44 is quite technical, so let us briefly discuss the main ideas underlying the proof. The crucial ingredient is to specify the potential function $G$ so that (100) is satisfied. To motivate the choice of $G$, by taking expectations in the second order Taylor expansion of $G(S_{t+1}/n)$ around $E[S_{t+1}/n \mid S_t = \zeta] \approx F(\zeta/n)$ we obtain

$E[G(S_{t+1}/n) \mid S_t = \zeta] \approx G(F(\zeta/n)) + \frac{1}{2} Var[S_{t+1}/n \mid S_t = \zeta]\, G''(F(\zeta/n))$. (101)

(The precise conditions on the derivatives of $G$ such that the approximation in (101) is sufficiently accurate are given in Lemma 51.) From (101), in order to satisfy (100), the function $G$ has to be carefully chosen to control the interplay between $G(F(x)) - G(x)$ and $G''(F(x))$. The first derivative of $G$ should correspond to the drift $F(x) - x$ of the process coming from its expectation, while the second derivative of $G$ should correspond to the variance of the process. More precisely, when $x$ is outside the critical window, the choice of the potential function is such that $G(F(x)) - G(x)$ is bounded above by a negative constant (i.e., its derivative is $1/(x - F(x))$); by our earlier remarks this should be sufficient to establish progress outside the critical window. Indeed, with this choice it turns out that $|G''(x)|/n$ is bounded above by a small constant outside the critical window, so that (100) is satisfied. Inside the critical window, where $x \approx F(x)$ and hence $G(F(x)) - G(x) \approx 0$, we choose $G$ so that $G''(x)$ is negative. More precisely, to satisfy (100), since $Var[S_{t+1}/n \mid S_t = \zeta] = \Theta(1/n)$ from Lemma 43, we set $G''(x) = -Cn$ for some constant $C > 0$. The remaining part is then to interpolate between these two regimes, keeping $G'(x)/G''(x)$ sufficiently large (so that (100) is satisfied) and $G(x)$ small (i.e., $O(n^{1/3})$); this is possible due to the quadratic behaviour of $F(z) - z$ around $z = a$. (See Lemma 53 and its proof for the explicit specification of $G$.)
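As a rough illustration of why a second derivative of order $-n$ suffices inside the critical window, the second-order term of the Taylor expansion can be evaluated directly. This is a heuristic sketch, not the argument of Lemma 51 or 53; here $Q > 0$ is a placeholder name for a constant with $Var[S_{t+1} \mid S_t = \zeta] \ge Qn$, as provided by the variance bounds of Lemma 43, and $x = \zeta/n$.

```latex
\begin{align*}
E[G(S_{t+1}/n) \mid S_t = \zeta] - G(x)
  &\approx \underbrace{G(F(x)) - G(x)}_{\approx\, 0 \text{ inside the window}}
     + \tfrac{1}{2}\, Var[S_{t+1}/n \mid S_t = \zeta]\; G''(F(x)) \\
  &\le \tfrac{1}{2} \cdot \frac{Q}{n} \cdot (-Cn) \;=\; -\frac{CQ}{2}.
\end{align*}
```

Under these assumptions the potential decreases in expectation by a constant per step even where the drift $F(x) - x$ vanishes, which is the form of decrease required in (100).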

We next combine Lemmas 41, 42 and 44 to show the following. 


Lemma 45. For $B = B_u$, there exists $L > 0$ such that the following is true. For any starting state $X_0$, in $T = O(n^{1/3})$ steps, with probability $\Theta(1)$ the SW algorithm ends up in a state $X_T$ such that $\|\alpha(X_T) - u\|_\infty \le Ln^{-1/2}$.


Proof. Let $\varepsilon > 0$ be a sufficiently small constant, to be picked later. We will assume that the state $X_1$ has $q - 1$ $\varepsilon$-light colors since (by the first part of Lemma 41) this event happens with probability $\Theta(1)$. Henceforth, we will condition on this event.

Recall that $S_t$ is the size of the largest color class at time $t$. We first prove that with probability $\Theta(1)$, for some $T = O(n^{1/3})$ it holds that $S_T \le (1 + \varepsilon)n/B$. Assuming this for the moment, then in the next step, i.e., at time $T + 1$, by Lemma 42 all color classes have size at most $(1 + 3q\varepsilon)n/q$ and (for all sufficiently small $\varepsilon$) are thus subcritical in the percolation step of the SW-dynamics. It follows that the components' sizes after the percolation step satisfy, by Item 1 in Lemma 14, $E\big[\sum_i |C_i|^2\big] = O(n)$. Hence, after the coloring step, using Azuma's inequality, with constant probability we have color classes of size $(n + O(n^{1/2}))/q$ (see for example the derivation of (57) and (58) for details).

It remains to argue that $T = O(n^{1/3})$. We will show in fact that $T = 3M_1 n^{1/3}/\tau$ suffices, where $M_1$ is the constant in Lemma 44. Let $P_t$ be the probability that at time $t$ it holds that $S_t \le (1 + \varepsilon)n/B$. Our goal is to show that $P_T = \Theta(1)$. We will use Lemma 44 and the potential function $G$ therein to bound $P_T$. In particular, we will show that for all $n$ sufficiently large, for all $t = 1, \dots, T$, it holds that

$E[G(S_{t+1}/n)] \le E[G(S_t/n)] - \tau(1 - P_t) + \tau/2$, (102)

where $\tau$ is the constant in Lemma 44. Prior to that, let us conclude the argument assuming (102). Note that if $S_t \le (1 + \varepsilon)n/B$ then $S_{t+1} \le (1 + \varepsilon)n/B$ with probability at least $1 - \exp(-\Theta(n^{1/3}))$ (by Lemma 42), so $P_t \le P_{t+1} + O(1/n)$. It thus follows from (102) that

$E[G(S_{T+1}/n)] \le E[G(S_1/n)] - \tau T(1/2 - P_T) + o(1)$,

which, since $0 \le G \le M_1 n^{1/3}$, gives $P_T \ge 1/2 - \frac{M_1 n^{1/3}}{\tau T} + o(1)$. For $T = 3M_1 n^{1/3}/\tau$ we thus have $P_T \ge 1/10$, as wanted.

Finally, we prove (102) for $t = 1, \dots, T$. Note that Lemmas 44 and 42 apply whenever $X_t$ has
$q - 1$ $\varepsilon$-light colors, so we will need to account for the (small-probability) event that
this fails. Namely, let $\mathcal{E}_t$ denote the event that $X_t$ has $q - 1$ $\varepsilon$-light
colors. Since we condition on the event that $\mathcal{E}_1$ holds, we have that
$\bigcap_{t=1}^{T} \mathcal{E}_t$ holds with probability at least $1 - \exp(-n^{\Omega(1)})$ (by the
second part of Lemma 41).

Let $\mathcal{F}_t$ be the event that $S_t \le (1+\varepsilon)n/B$ and note that
$P_t = \Pr(\mathcal{F}_t)$. By taking expectations in inequality (100) of Lemma 44, we have

\[ E[G(S_{t+1}/n) \mid \mathcal{E}_t, \neg\mathcal{F}_t] \le E[G(S_t/n) \mid \mathcal{E}_t, \neg\mathcal{F}_t] - \tau. \tag{103} \]

Note that if $S_t \le (1+\varepsilon)n/B$, then by Lemma 42, with probability
$1 - \exp(-n^{\Omega(1)})$ we have $S_{t+1} \le (1 + 3q\varepsilon)n/q$. From Lemma 44, we have
$G(1/q) = 0$ and $\max_{z \in [1/q, 1/B]} G'(z) \le M_2$ where $M_2$ is an absolute constant independent
of $n$. It follows that for all sufficiently small constant $\varepsilon > 0$, when
$S_{t+1} \le (1 + 3q\varepsilon)n/q$, it holds that $G(S_{t+1}/n) \le \tau/3$. It follows that

\[ E[G(S_{t+1}/n) \mid \mathcal{E}_t, \mathcal{F}_t] \le \tau/3. \tag{104} \]

Note that $G$ is non-negative throughout the interval $[1/q, 1]$ since $G(1/q) = 0$ and $G$ is
increasing. By the non-negativity of $G$, we thus obtain the crude inequality

\[ \Pr(\neg\mathcal{F}_t \mid \mathcal{E}_t)\, E[G(S_t/n) \mid \mathcal{E}_t, \neg\mathcal{F}_t] \le E[G(S_t/n) \mid \mathcal{E}_t]. \tag{105} \]

Let $P'_t$ be the probability that at time $t$ it holds that $S_t \le (1+\varepsilon)n/B$ conditioned on
the event $\mathcal{E}_t$, i.e., $P'_t := \Pr(\mathcal{F}_t \mid \mathcal{E}_t)$. Note that
$P_t \ge P'_t(1 - \exp(-n^{\Omega(1)})) \ge P'_t - \exp(-n^{\Omega(1)})$. Combining (103), (104) and
(105), we obtain

\[ E[G(S_{t+1}/n) \mid \mathcal{E}_t] \le E[G(S_t/n) \mid \mathcal{E}_t] - \tau(1 - P'_t) + \tau/3. \tag{106} \]

Since $G$ is bounded by a polynomial and since the probability of the event $\neg\mathcal{E}_t$ is
exponentially small, removing the conditioning in (106) only affects the inequality by an additive
$o(1)$. Similarly, replacing $P'_t$ with $P_t$ in (106) only affects the inequality by an additive
$o(1)$. This proves that (102) holds for all sufficiently large $n$, thus concluding the proof of
Lemma 45. □

Using Lemma 45, it is not hard to obtain the following corollary. 

Corollary 46. Let $B = \mathfrak{B}_u$. The mixing time of the Swendsen-Wang algorithm on the complete
graph on $n$ vertices is $O(n^{1/3})$.

Proof. Consider two copies $(X_t)$, $(Y_t)$ of the SW-chain. As in the proof of Corollary 28, it
suffices to show that for $T = O(n^{1/3})$, there exists a coupling of $(X_t)$ and $(Y_t)$ such that
$\Pr(X_T = Y_T) = \Omega(1)$. By Lemma 45, for $T_1 = O(n^{1/3})$, it holds that with probability
$\Theta(1)$

\[ \|\alpha(X_{T_1}) - u\|_\infty \le L n^{-1/2} \quad\text{and}\quad \|\alpha(Y_{T_1}) - u\|_\infty \le L n^{-1/2}. \tag{107} \]

Conditioning on (107), by an analogue of Lemma 22, there exists a coupling such that with
probability $\Theta(1)$ for $T_2 = T_1 + 1$, it holds that $\alpha(X_{T_2}) = \alpha(Y_{T_2})$.
Conditioning on $\alpha(X_{T_2}) = \alpha(Y_{T_2})$, by Lemma 21 there exists $T_3 = O(\log n)$ and a
coupling such that $\Pr(X_{T_2+T_3} = Y_{T_2+T_3} \mid \alpha(X_{T_2}) = \alpha(Y_{T_2})) = \Omega(1)$.
It is now immediate to combine the couplings to obtain a coupling such that
$\Pr(X_T = Y_T) = \Omega(1)$ with $T = T_2 + T_3 = O(n^{1/3})$, as desired. □

11.3 Lower bound on the mixing time at $B = \mathfrak{B}_u$

In this section, we prove that the mixing time of the SW-algorithm at $B = \mathfrak{B}_u$ satisfies
$T_{mix} = \Omega(n^{1/3})$.

As in the proof of the upper bound, the lower bound on the mixing time follows by carefully
accounting for the number of steps that the SW-algorithm needs to escape the window around the
majority phase. Our goal in this section is to show that it takes $\Omega(n^{1/3})$ steps to escape
the window. The following lemma provides the "reverse" direction of Lemma 44. Recall that for a
state $X_t$ of the SW-algorithm, the size of the largest color class is denoted by $S_t$.
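
To make the quantity $S_t$ concrete, the following is a minimal Python sketch of one Swendsen-Wang step on the complete graph. It assumes, as the threshold $n/B$ for supercritical color classes suggests, that each monochromatic edge is kept independently with probability $p = B/n$; this parametrization is an assumption of the sketch. The names `sw_step` and `largest_class` are illustrative and not from the paper, and the quadratic-time edge loop is only meant for small $n$.

```python
import random
from collections import Counter

def sw_step(colors, q, B, rng):
    """One Swendsen-Wang step on the complete graph K_n (illustrative sketch).

    Assumption: monochromatic edges are kept independently with p = B/n.
    """
    n = len(colors)
    p = B / n
    parent = list(range(n))  # union-find forest over the vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Group vertices by color; only monochromatic edges can be kept.
    classes = {}
    for v, c in enumerate(colors):
        classes.setdefault(c, []).append(v)

    # Percolation step: keep each monochromatic edge with probability p.
    for members in classes.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                if rng.random() < p:
                    ru, rv = find(members[i]), find(members[j])
                    if ru != rv:
                        parent[ru] = rv

    # Coloring step: each connected component gets a uniform color in {0..q-1}.
    component_color = {}
    new_colors = []
    for v in range(n):
        r = find(v)
        if r not in component_color:
            component_color[r] = rng.randrange(q)
        new_colors.append(component_color[r])
    return new_colors

def largest_class(colors):
    """S_t: the size of the largest color class."""
    return max(Counter(colors).values())
```

Starting from the all-one-color state and iterating `sw_step` while recording `largest_class` gives an empirical trajectory of $S_t$; by pigeonhole, $S_t \ge n/q$ always holds.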

Lemma 47. Let $B = \mathfrak{B}_u$. There exist constants $M_1, M_2, \rho > 0$ such that for all
sufficiently small $\varepsilon > 0$, for all sufficiently large $n$ the following holds. There exists
a three-times differentiable increasing function $G : [1/q, 1] \to [0, M_1 n^{1/3}]$ which satisfies
$G(1/B) = O(1)$, $G(1) \ge M_2 n^{1/3}$ such that for any $\zeta \ge n/q$, if $X_t$ has $(q - 1)$ colors
which are $\varepsilon$-light it holds that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] \ge G(\zeta/n) - \rho. \tag{108} \]

We remark here that the potential function in Lemmas 44 and 47 will be chosen to be identical. 
We thus refer the reader to the discussion after Lemma 44 for an overview of the construction of 
G and to Section 11.4 for the actual construction and the proof of Lemma 47. 

Analogously to Section 9, we will also need a (crude) bound on the probability mass of
configurations which are far from the uniform phase in the Potts distribution. Recall from Section 5
that $H(v, \delta)$ is the $\ell_\infty$-ball of configuration vectors of the $q$-state Potts model in
$K_n$ around $v$ of radius $\delta$, cf. equation (34). For a constant $\eta > 0$, let

\[ U(\eta) := H(u, \eta). \]

The following lemma is analogous to Lemma 34 and its proof hinges on the arguments used to
derive the upper bound for the mixing time at $B = \mathfrak{B}_u$. We write $\overline{U}$ for the
complement of a set $U$ of configurations.

Lemma 48. Let $B = \mathfrak{B}_u$ and $\eta > 0$ be a constant. For all sufficiently large $n$, the
Potts distribution $\mu$ (given in (1)) satisfies $\mu(\overline{U(\eta)}) \le 1/8$.

Proof. For convenience, denote $U := U(\eta)$. By Lemma 45, for all sufficiently large $n$ and any
starting state $X_0$, we have that for some $T = O(n^{1/3})$, it holds that

\[ \Pr(X_T \in U) \ge \epsilon, \]

where $\epsilon > 0$ is a constant independent of $n$. It follows that for all non-negative integers
$j$ it also holds that

\[ \Pr(X_{(j+1)T} \in U \mid X_{jT} \notin U) \ge \epsilon. \]

Further, by Lemma 18, for integer $t \ge 0$, it holds that

\[ \Pr(X_{t+1} \in U \mid X_t \in U) \ge 1 - \exp(-\Omega(n^{1/3})). \]

We thus obtain that for any starting state $X_0$, for some positive integer $j = j(\epsilon)$, for all
sufficiently large $n$, for all integer $t \ge jT$, it holds that

\[ \Pr(X_t \in U) \ge 15/16. \tag{109} \]

Let $T^* = \max\{jT, 2T_{mix}\}$. By Corollary 46, we have $T_{mix} = O(n^{1/3})$, so
$T^* = O(n^{1/3})$ as well. The same arguments as in the proof of Lemma 34 (cf. equation (75)) yield

\[ \mu(\overline{U}) - \Pr(X_{T^*} \in \overline{U}) \le 1/16. \tag{110} \]

Combining (109) and (110) yields $\mu(\overline{U}) \le 1/8$, as wanted. □



The following lemma can be derived from Lemma 47 by suitably adapting the proof of Lemma 45. 

Lemma 49. For $B = \mathfrak{B}_u$, there exists a constant $\eta > 0$ such that the following is true
for all $n$. Suppose that we start at a state $X_0$ where all the vertices are assigned the color 1.
Then for some $T = \Omega(n^{1/3})$, with probability $\ge 1/2$, it holds that $X_T \notin U(\eta)$.

Proof. Recall that $S_t$ is the size of the largest color class at time $t$. Let $\varepsilon > 0$ be a
sufficiently small constant, to be picked later. We will prove that, for some
$T = \Omega(n^{1/3})$, it holds that

\[ \Pr\big(S_T \ge (1+\varepsilon)n/q\big) \ge 1/2. \tag{111} \]

Let $\eta := \varepsilon/q$ and note that $\eta$ is a constant. The lemma then follows by just observing
that $\Pr(X_T \notin U(\eta)) \ge \Pr(S_T \ge (1+\varepsilon)n/q)$.

We next argue that there exists $T = \Omega(n^{1/3})$ such that (111) holds. To do this, we will show
that for all $n$ sufficiently large, for all $t = 0, \dots, n$, it holds that

\[ E[G(S_{t+1}/n)] \ge E[G(S_t/n)] - 2\rho, \tag{112} \]

where $G, \rho$ are the potential function and the constant from Lemma 47, respectively. Prior to
proving (112), let us conclude the argument assuming (112).

Lemma 47 asserts that there exist constants $M_1, M_2 > 0$ so that

\[ 0 \le G(z) \le G(1) \text{ for all } z \in [1/q, 1], \text{ with } G(1) = Cn^{1/3} \text{ and } C \text{ satisfying } M_2 \le C \le M_1. \tag{113} \]

Let $T = M_2 n^{1/3}/(6\rho)$ and note that $T = \Theta(n^{1/3})$. From (112) for
$t = 0, \dots, T - 1$, we obtain that

\[ E[G(S_T/n)] \ge G(S_0/n) - 2\rho T. \]

Since $S_0 = n$ and $G(1) = Cn^{1/3}$, it thus follows that $E[G(S_T/n)] \ge (2/3)Cn^{1/3}$. Let
$\varepsilon > 0$ be such that $(1+\varepsilon)/q < 1/B$; such an $\varepsilon$ exists since at
$B = \mathfrak{B}_u$ it holds that $1/q < 1/B$. From $G(1/B) = O(1)$ and the fact that $G$ is increasing,
we obtain that there exists a constant $\xi > 0$ such that $G((1+\varepsilon)/q) \le \xi$. It is
immediate now to conclude that with probability $\ge 1/2$ it holds that
$S_T \ge (1+\varepsilon)n/q$; otherwise, using (113), we would have that for sufficiently large $n$,
$E[G(S_T/n)] \le \xi/2 + Cn^{1/3}/2 \le (3/5)Cn^{1/3}$, contradicting our lower bound for
$E[G(S_T/n)]$.

Finally, we prove (112) for $t = 0, \dots, n$. We will use Lemmas 41 and 47. Let $\varepsilon > 0$ be a
small constant as in the statement of Lemma 41. Note that Lemma 47 applies whenever $X_t$ has $q - 1$
$\varepsilon$-light colors, so we will need to account for the (small-probability) event that this
fails. Namely, let $\mathcal{E}_t$ denote the event that $X_t$ has $q - 1$ $\varepsilon$-light colors.
Since the event $\mathcal{E}_0$ holds (by the choice of the starting state $X_0$), we have that
$\bigcap_{t=0}^{n} \mathcal{E}_t$ holds with probability at least $1 - \exp(-n^{\Omega(1)})$ (by the
second part of Lemma 41).

Let $t$ be an integer between 0 and $n$. By taking expectations in inequality (108) of Lemma 47,
we have

\[ E[G(S_{t+1}/n) \mid \mathcal{E}_t] \ge E[G(S_t/n) \mid \mathcal{E}_t] - \rho. \tag{114} \]

Since $G$ is bounded by a polynomial (cf. (113)) and the probability of the event
$\neg\mathcal{E}_t$ is exponentially small, removing the conditioning in (114) only affects the
inequality by an additive $o(1)$. This proves that (112) holds for all sufficiently large $n$, thus
concluding the proof of Lemma 49. □

Using Lemmas 48 and 49, we obtain the following corollary. 

Corollary 50. Let $B = \mathfrak{B}_u$. Then $T_{mix} = \Omega(n^{1/3})$.

Proof. Let $\eta$ be as in Lemma 49 and let $U := U(\eta)$. Consider $X_0$ where all the vertices are
assigned the color 1. Then, by Lemma 49, for some $T = \Omega(n^{1/3})$ we have that

\[ \Pr(X_T \notin U) \ge 1/2. \]

On the other hand, by Lemma 48 we have that $\mu(\overline{U}) \le 1/8$. It follows that

\[ d_{TV}(X_T, \mu) = \max_A |\mu(A) - \Pr(X_T \in A)| \ge \Pr(X_T \notin U) - \mu(\overline{U}) \ge 1/2 - 1/8 > 1/4. \]

It follows from the definition of mixing time that $T_{mix} \ge T$, as claimed. □
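
The last step uses only the fact that any single event lower-bounds the total variation distance. The toy Python sketch below illustrates this on a two-point space; the dictionaries `chain_law` and `stationary` are made-up numbers chosen to mirror the bounds $1/2$ and $1/8$ above, not quantities computed from the Potts model.

```python
import math

def tv_distance(p, mu):
    """Total variation distance between two finite distributions (dicts)."""
    support = set(p) | set(mu)
    return 0.5 * sum(abs(p.get(x, 0.0) - mu.get(x, 0.0)) for x in support)

def event_gap(p, mu, event):
    """|p(A) - mu(A)| for an event A, which never exceeds tv_distance(p, mu)."""
    p_mass = sum(p.get(x, 0.0) for x in event)
    mu_mass = sum(mu.get(x, 0.0) for x in event)
    return abs(p_mass - mu_mass)

# Hypothetical laws: the chain is outside U with probability 1/2,
# while the stationary measure puts mass 1/8 outside U.
chain_law = {"in_U": 0.5, "out_U": 0.5}
stationary = {"in_U": 0.875, "out_U": 0.125}

gap = event_gap(chain_law, stationary, {"out_U"})
tv = tv_distance(chain_law, stationary)
```

Here the event "outside $U$" witnesses a gap of $3/8 > 1/4$, so the chain cannot have mixed by that time.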

11.4 Constructing the potential function - Proof of Lemmas 44 and 47 

In this section, we prove Lemmas 44 and 47, i.e., construct the potential function G. We split the 
argument in several lemmas. 

The first lemma achieves two goals: first, it quantifies the bounds that the function $G$ must
satisfy so that the approximation

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] \approx G(F(\zeta/n)) + \tfrac{1}{2}\mathrm{Var}[S_{t+1}/n \mid S_t = \zeta]\, G''(F(\zeta/n)) \tag{101} \]

is valid; the bounds are given in (115). Second, it gives an inequality that the function $G$ must
satisfy (cf. equation (116)) which allows us to deduce, using the approximation (101), the bounds on
$E[G(S_{t+1}/n) \mid S_t = \zeta] - G(\zeta/n)$ claimed in Lemmas 44 and 47 (see (117) below).

Lemma 51. Let $\varepsilon > 0$. Suppose that, for all $n$ sufficiently large, $S_t$ and $S_{t+1}$ are
random variables that satisfy (85), (86), (87) when $\zeta \ge (1+\varepsilon)n/B$.

Let $G$ be a three-times differentiable potential function defined on the interval $[1/q, 1]$ such
that

\[ \min_x G'(x) > 0, \quad \max_x |G'(x)| = O(n^{2/3}), \quad \max_x |G''(x)| = O(n), \quad \sup_x |G'''(x)| = O(n^{4/3}). \tag{115} \]

Further, assume that for each $x > 1/B$, it holds that

\[ -\tau_2 \le G(F(x)) - G(x) + G''(F(x))\,Q_1/(2n) \le -\tau_1, \]
\[ -\tau_2 \le G(F(x)) - G(x) + G''(F(x))\,Q_2/(2n) \le -\tau_1, \tag{116} \]

where $\tau_1, \tau_2 > 0$ are constants (independent of $n$) and $Q_1, Q_2$ are as in (86).

Then, for any $\zeta \ge (1+\varepsilon)n/B$, it holds that

\[ G(\zeta/n) - 2\tau_2 \le E[G(S_{t+1}/n) \mid S_t = \zeta] \le G(\zeta/n) - \tau_1/2. \tag{117} \]

Recall that for $B = \mathfrak{B}_u$ the function $F(z)$ has exactly one fixpoint in the interval
$(1/B, 1]$ at $z = a$. The following lemma will be useful for the proof of Lemma 53.

Lemma 52. Let $B = \mathfrak{B}_u$. Then it holds that

1. $F'(z) = 1$ iff $z = a$.

2. $F''(z) < 0$ for all $z \in (1/B, 1]$.

3. $F(z) \le z$ for all $z \in [1/B, 1]$ with equality iff $z = a$.

Proof. The proofs for the first two parts are given in Lemma 8 and Lemma 4 respectively. For the
third part, note that the function $z - F(z)$ is convex in $[1/B, 1]$ and has a unique critical point
at $z = a$. Thus, $z - F(z) \ge a - F(a) = 0$ with equality iff $z = a$. □

The following lemma specifies a potential function G which will be used to verify the conditions 
(115) and (116) in Lemma 51. We have already described in Section 11.2, the high-level approach 
for the construction of G. The actual definition of G is quite technical due to the requirement that 
G should be three times differentiable. We pulled out the important bits in the construction of G 
that will also be relevant in verifying (116). 

For positive real numbers $A, B$ we will use the notation $A \gg B$ to denote that for some (large)
constant $C > 1$, it holds that $A > BC$.

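Lemma 53. Let $L, L'$ be positive constants which satisfy $L \gg L'$. There exist positive constants
$M, C_0, C_1, C_2$, depending only on $L$, such that the following holds.

For all sufficiently large $n$, there exists a strictly increasing three-times differentiable function
$G : [1/q, 1] \to [0, Mn^{1/3}]$ with $G(1/q) = 0$ which satisfies (115) and

\[
\begin{aligned}
& |G'(z)|, |G''(z)| \le C_0 && \text{for } z \in [1/q, 1/B], \\
& G'(z) = \frac{1}{z - F(z)} && \text{for } z \in [1/B, a - Ln^{-1/3}] \cup [a + Ln^{-1/3}, 1], \\
& G'(z) \ge C_1 n^{2/3}, \ |G''(z)| \le (10^2 C_1/L)\,n && \text{for } z \in [a - Ln^{-1/3}, a - L'n^{-1/3}], \\
& G''(z) \le -C_2 n && \text{for } z \in [a - L'n^{-1/3}, a + Ln^{-1/3}].
\end{aligned}
\tag{118}
\]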

Lemma 54. Let $L, L'$ be positive constants which satisfy $L \gg L' \gg 1$. Then, there exist constants
$\tau_1, \tau_2 > 0$, such that, for any function $G$ satisfying (115) and (118), inequality (116) holds
for every $x > 1/B$.

Proof of Lemma 44. Let $L, L'$ be positive constants satisfying $L \gg L' \gg 1$. By Lemmas 53 and 54,
there exist constants $M, \tau_1, \tau_2 > 0$ such that for all sufficiently large $n$ there exists a
three-times differentiable function $G : [1/q, 1] \to [0, Mn^{1/3}]$ which satisfies both (115) and
(116). Note that (115) guarantees that $G$ is increasing. Further, by Lemma 53, it holds that
$G(1/q) = 0$ and $\max_{z \in [1/q, 1/B]} G'(z) \le C_0$ where $C_0$ is a constant. We will use this
function $G$ to prove Lemma 44 with $M_1 = M$, $M_2 = C_0$ and $\tau = \tau_1/2$.

Let $\varepsilon > 0$ be a sufficiently small constant and suppose that $X_t$ has $(q - 1)$ colors
which are $\varepsilon$-light. Recall that $S_t$ is the size of the largest color class in $X_t$. By
Lemma 43, we have that for all sufficiently large $n$, for all $\zeta \ge (1+\varepsilon)n/B$, the random
variables $S_t, S_{t+1}$ satisfy (85), (86), (87). It follows by Lemma 51 that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] \le G(\zeta/n) - \tau_1/2. \]

This completes the verification of all the conditions that $G$ must satisfy, concluding the proof of
the lemma. □

Proof of Lemma 47. We begin by specifying some constants. Let $\varepsilon_0 > 0$ be a constant such
that $(1 + \varepsilon_0)/B < a$ and let

\[ W_0 := \min_{z \in [1/B, (1+\varepsilon_0)/B]} \{z - F(z)\}. \tag{119} \]

By Lemma 52 and the choice of $\varepsilon_0$, we have that $W_0 > 0$.

Consider positive constants $L, L'$ satisfying $L \gg L' \gg 1$. By Lemmas 53 and 54, there exist
constants $M, \tau_1, \tau_2 > 0$ such that for all sufficiently large $n$ there exists a function
$G : [1/q, 1] \to [0, Mn^{1/3}]$ which satisfies all of (115), (116) and (118). Further, by Lemma 53,
it holds that

\[ G(1/q) = 0, \quad \max_{z \in [1/q, 1/B]} G'(z) \le C_0, \quad \min_{z \in [a, a + Ln^{-1/3}]} G'(z) \ge C_3 n^{2/3}, \]

where $C_0, C_3$ are constants. It is immediate to conclude from here that $G(1/B) = O(1)$ and
$G(1) \ge C_3 L n^{1/3}$ (for the latter we also need that $G$ is increasing, which is guaranteed from
(115)). We will also need a bound on the variation of $G$ on the interval
$[1/q, (1+\varepsilon_0)/B]$. Using (118) and (119), we have that
$\max_{z \in [1/B, (1+\varepsilon_0)/B]} G'(z) \le 1/W_0$. It follows that for
$\eta_0 := \max\{C_0, 1/W_0\}$, it holds that $G'(z) \le \eta_0$ for all
$z \in [1/q, (1+\varepsilon_0)/B]$ and thus there exists a constant $\eta > 0$ such that

\[ |G(z_1) - G(z_2)| \le \eta \text{ for all } z_1, z_2 \in [1/q, (1+\varepsilon_0)/B]. \tag{120} \]

We will use $G$ to prove Lemma 47 with $M_1 = M$, $M_2 = C_3 L$ and $\rho = 2\tau_2 + 2\eta$.

Let $\varepsilon \in (0, \varepsilon_0]$ be a sufficiently small constant and $n$ be sufficiently
large. Suppose that $X_t$ has $(q - 1)$ colors which are $\varepsilon$-light. Recall that $S_t$ is the
size of the largest color class in $X_t$ and suppose that $S_t = \zeta$ where $\zeta \ge n/q$. We will
split the proof into cases depending on whether $\zeta \ge (1+\varepsilon)n/B$.

Consider first the case where $\zeta \ge (1+\varepsilon)n/B$. By Lemma 43, we have that the random
variables $S_t, S_{t+1}$ satisfy (85), (86), (87). It follows by Lemma 51 that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] \ge G(\zeta/n) - 2\tau_2. \]

Consider now the case where $\zeta < (1+\varepsilon)n/B$ so that $S_t < (1+\varepsilon)n/B$. Let
$\mathcal{E}_t$ be the event that $S_{t+1} \le (1+\varepsilon)n/B$. By Lemma 42, we have that
$\Pr(\mathcal{E}_t) = 1 - \exp(-n^{\Omega(1)})$. Also, using (120), we have that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta, \mathcal{E}_t] \ge G(\zeta/n) - \eta. \tag{121} \]

Recall that $G$ is non-negative with values that are polynomially bounded. Since $\mathcal{E}_t$ fails
only with exponentially small probability, it follows that removing the conditioning on the event
$\mathcal{E}_t$ in (121) only affects the inequality by $o(1)$. Hence, for all sufficiently large $n$,
it holds that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] \ge G(\zeta/n) - 2\eta. \]

In both cases the decrease is at most $\rho = 2\tau_2 + 2\eta$. This completes the verification of all
the conditions that $G$ must satisfy, concluding the proof of the lemma. □

Proof of Lemma 51. Let $x = \zeta/n$ and $y = E[S_{t+1}/n \mid S_t = \zeta]$. Let $Z = S_{t+1}/n - y$.
Note that $Z$ is a random variable, $E[Z \mid S_t = \zeta] = 0$, and by Lemma 43,

\[ Q_1/n \le \mathrm{Var}[Z \mid S_t = \zeta] = E[Z^2 \mid S_t = \zeta] \le Q_2/n. \]

By Taylor's expansion, we have

\[ G(y + Z) = G(y) + G'(y)Z + G''(y)\frac{Z^2}{2} + G'''(\rho)\frac{Z^3}{6}, \tag{122} \]

for some $\rho$ which lies between $y$ and $y + Z$ (note that $\rho$ is also a random variable).

From condition (87) of Lemma 43 we have, for all sufficiently small $\varepsilon' > 0$,

\[ E[|Z|^3 \mid S_t = \zeta] \le K n^{-3/2 + \varepsilon'}. \]

Taking expectations of (122) we obtain

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] = E[G(y + Z) \mid S_t = \zeta] = G(y) + G''(y)\frac{E[Z^2 \mid S_t = \zeta]}{2} + \chi, \tag{123} \]

where $|\chi| \le \frac{K}{6} n^{-3/2+\varepsilon'} \sup_x |G'''(x)| = o(1)$ since
$\sup_x |G'''(x)| = O(n^{4/3})$. Using (85) of Lemma 43, we have

\[ |G(y) - G(F(x))| \le \frac{n^{1/10}}{n} \sup_x |G'(x)| \quad\text{and}\quad |G''(y) - G''(F(x))| \le \frac{n^{1/10}}{n} \sup_x |G'''(x)|. \]

Plugging these estimates in (123) we obtain

\[ \Big| E[G(S_{t+1}/n) \mid S_t = \zeta] - \Big( G(F(x)) + G''(F(x)) \frac{E[Z^2 \mid S_t = \zeta]}{2} \Big) \Big| \le R, \tag{124} \]

where $R$ is an error term satisfying

\[ |R| \le \frac{n^{1/10}}{n} \sup_x |G'(x)| + \frac{n^{1/10}}{n} \sup_x |G'''(x)| \frac{E[Z^2 \mid S_t = \zeta]}{2} + |\chi|. \]

From $\sup_x |G'(x)| = O(n^{2/3})$, $\sup_x |G'''(x)| = O(n^{4/3})$ and
$E[Z^2 \mid S_t = \zeta] \le Q_2/n$ we obtain that $|R| = o(1)$.

It thus follows from (124) that

\[ E[G(S_{t+1}/n) \mid S_t = \zeta] - G(\zeta/n) = G(F(x)) - G(x) + G''(F(x)) \frac{E[Z^2 \mid S_t = \zeta]}{2} + o(1). \tag{125} \]

We also have that

\[ G(F(x)) - G(x) + G''(F(x)) \frac{E[Z^2 \mid S_t = \zeta]}{2} \le \max_{i \in \{1,2\}} \Big\{ G(F(x)) - G(x) + G''(F(x)) \frac{Q_i}{2n} \Big\} \le -\tau_1, \tag{126} \]

where in the first inequality we used that $Q_1/n \le E[Z^2 \mid S_t = \zeta] \le Q_2/n$ (note that both
estimates are needed since we do not know the sign of $G''$) and in the second inequality we used
(116). Analogously, one has

\[ G(F(x)) - G(x) + G''(F(x)) \frac{E[Z^2 \mid S_t = \zeta]}{2} \ge \min_{i \in \{1,2\}} \Big\{ G(F(x)) - G(x) + G''(F(x)) \frac{Q_i}{2n} \Big\} \ge -\tau_2. \tag{127} \]

Combining (125), (126) and (127), it follows that for all sufficiently large $n$ it holds that

\[ -2\tau_2 \le -\tau_2 + o(1) \le E[G(S_{t+1}/n) \mid S_t = \zeta] - G(\zeta/n) \le -\tau_1 + o(1) \le -\tau_1/2. \]

This proves that (117) holds, as wanted. □
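
The second-order expansion behind (123) can be checked numerically on a toy example. The Python sketch below is only an illustration under stated assumptions: it takes $G = \exp$ (so that $G'' = G''' = \exp$) and a mean-zero two-point variable $Z$ equal to $2\delta$ with probability $1/3$ and $-\delta$ with probability $2/3$; neither choice comes from the paper. It compares $E[G(y+Z)]$ with $G(y) + G''(y)E[Z^2]/2$ and with the remainder bound $\sup|G'''| \cdot E[|Z|^3]/6$.

```python
import math

def taylor_check(G, G2, G3_sup, y, delta):
    """Compare E[G(y+Z)] with its second-order approximation.

    Z = 2*delta with probability 1/3 and -delta with probability 2/3,
    so E[Z] = 0, E[Z^2] = 2*delta^2 and E[|Z|^3] = 10*delta^3/3.
    G2 is the second derivative of G; G3_sup bounds |G'''| on [y-delta, y+2*delta].
    """
    exact = G(y + 2 * delta) / 3 + 2 * G(y - delta) / 3
    second_moment = 2 * delta ** 2
    third_abs_moment = 10 * delta ** 3 / 3
    approx = G(y) + G2(y) * second_moment / 2
    bound = G3_sup * third_abs_moment / 6
    return exact, approx, bound

y, delta = 0.5, 0.01
exact, approx, bound = taylor_check(math.exp, math.exp, math.exp(y + 2 * delta), y, delta)
error = abs(exact - approx)
```

The first-order term drops out because $E[Z] = 0$, which is why only the variance and the third-moment remainder appear in (123).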

We next prove Lemmas 53 and 54. It is more instructive to use Lemma 53 as a black box for 
now and prove Lemma 54 first. 

Proof of Lemma 54. Let $L \gg L' \gg 1$ be constants and $n$ be large. Also, let

\[ z_- := a - Ln^{-1/3}, \quad z_m := a - L'n^{-1/3}, \quad z_+ := a + Ln^{-1/3}, \]

and consider the intervals

\[ I_0 = [1/q, 1/B], \quad I_1 = [1/B, z_-], \quad I_2 = [z_-, z_m], \quad I_3 = [z_m, z_+], \quad I_4 = [z_+, 1]. \]

Let $G$ be a function defined on the interval $[1/q, 1]$ that satisfies (115) and (118), i.e.,

\[ \min_x G'(x) > 0, \quad \max_x |G'(x)| = O(n^{2/3}), \quad \max_x |G''(x)| = O(n), \quad \sup_x |G'''(x)| = O(n^{4/3}), \tag{115} \]

and

\[
\begin{aligned}
& |G'(z)|, |G''(z)| \le C_0 && \text{for } z \in I_0, \\
& G'(z) = \frac{1}{z - F(z)} && \text{for } z \in I_1 \cup I_4, \\
& G'(z) \ge C_1 n^{2/3}, \ |G''(z)| \le (10^2 C_1/L)\,n && \text{for } z \in I_2, \\
& G''(z) \le -C_2 n && \text{for } z \in I_3,
\end{aligned}
\tag{118}
\]

where recall that $C_0, C_1, C_2$ are positive constants.

Our goal is to show that there exist constants $\tau_1, \tau_2 > 0$, so that for all $x \in (1/B, 1]$ it
holds that

\[ -\tau_2 \le G(F(x)) - G(x) + G''(F(x))\,Q_1/(2n) \le -\tau_1, \]
\[ -\tau_2 \le G(F(x)) - G(x) + G''(F(x))\,Q_2/(2n) \le -\tau_1, \tag{116} \]

where $Q_1, Q_2$ are positive constants satisfying $Q_2 \le Q_1$.

Let $\varepsilon > 0$ be a small constant to be chosen later. We first show the inequality (116) in the
easier regime where $x \notin (a - \varepsilon, a + \varepsilon)$. Let

\[ W_1 := \min_{x \in I_1 \cup I_4,\ x \notin (a-\varepsilon, a+\varepsilon)} \{x - F(x)\}, \qquad W_2 := \max_{x \in I_1 \cup I_4,\ x \notin (a-\varepsilon, a+\varepsilon)} \{x - F(x)\}. \]

By Lemma 52, for all $x \ne a$ such that $x \in I_1 \cup I_4$ it holds that $F(x) < x$. Since
$x - F(x)$ is continuous, we obtain that $W_1, W_2 > 0$. It follows that $G'(x)$ is bounded between the
absolute constants $1/W_2$ and $1/W_1$. Hence, there exist constants $W'_1, W'_2 > 0$ such that for all
$x \in I_1 \cup I_4$ with $x \notin (a - \varepsilon, a + \varepsilon)$, it holds that

\[ -W'_2 \le G(F(x)) - G(x) \le -W'_1. \]

A similar argument shows that $|G''(x)|$ is bounded by a constant for all $x \in [1/q, 1]$ with
$x \notin (a - \varepsilon, a + \varepsilon)$. It follows that

\[ \max_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \le -W'_1 + o(1), \]
\[ \min_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \ge -W'_2 + o(1). \]

This proves (116) when $x \notin (a - \varepsilon, a + \varepsilon)$.

We next prove (116) when $x \in (a - \varepsilon, a + \varepsilon)$. We first specify an appropriate
$\varepsilon$, so that we can use the expansion of $F(z)$ around $z = a$. Let $c = -F''(a)/2$ and note
that $c > 0$ by Lemma 52. Using again Lemma 52 and Taylor's Theorem, there exists $\varepsilon'' > 0$
such that for all $z \in (a - \varepsilon'', a + \varepsilon'')$, it holds that

\[ F(z) = z - c(z - a)^2 + O((z - a)^3). \]

Hence, there exists $\varepsilon' > 0$ so that for all $z \in (a - \varepsilon', a + \varepsilon')$ it
holds that

\[ \tfrac{1}{2} c(z - a)^2 \le z - F(z) \le 2c(z - a)^2, \qquad c|z - a| \le |F'(z) - 1| \le 4c|z - a|. \tag{129} \]

Let $\varepsilon > 0$ be a small constant such that $\varepsilon + 2c\varepsilon^2 < \varepsilon'$ and
$4c\varepsilon, 4c^2\varepsilon^2 < 1/8$. As noted earlier, the choice of $\varepsilon$ will allow us to
use the expansion of $F(z)$ around $z = a$. Before we proceed, we give a few intermediate inequalities
that will be used later to establish the desired inequalities in (116).

For $x \in (a - \varepsilon, a + \varepsilon)$, we will use the parametrization
$x = a + Kn^{-1/3}$ so that $|K| < \varepsilon n^{1/3}$. From (129), we have that

\[ \tfrac{1}{2} cK^2 n^{-2/3} \le x - F(x) \le 2cK^2 n^{-2/3}. \tag{130} \]

By the Mean Value Theorem, we also have that there exists $\xi \in (F(x), x)$ such that

\[ G(F(x)) - G(x) = G'(\xi)(F(x) - x). \tag{131} \]

Since $\xi \in (F(x), x)$, we have by (130) that $\xi = x - kcK^2 n^{-2/3}$ for some $0 < k < 2$. By the
choice of $\varepsilon$, it follows that $\xi \in (a - \varepsilon', a + \varepsilon')$ and hence, using
(129), we obtain $\tfrac{1}{2} c(\xi - a)^2 \le \xi - F(\xi) \le 2c(\xi - a)^2$. Using that
$4c\varepsilon, 4c^2\varepsilon^2 < 1/8$, we obtain

\[ \tfrac{1}{4} cK^2 n^{-2/3} \le \xi - F(\xi) \le 4cK^2 n^{-2/3}. \tag{132} \]

Finally, for the lower bounds we will sometimes use the following immediate consequences of (115):
there exists a constant $C'_0 > 0$ such that for all $x \in [1/q, 1]$ it holds that

\[ 0 < G'(x) \le C'_0 n^{2/3}, \qquad |G''(x)| \le C'_0 n. \tag{133} \]

We are now ready to give the proof of (116) for $x \in (a - \varepsilon, a + \varepsilon)$. The proof
splits into cases depending on the value of $K$ in the parametrization $x = a + Kn^{-1/3}$.

Case I. $K \le -L$ or $K \ge L$. We will do the case $K \le -L$; the proof for $K \ge L$ is analogous.
Thus, we have that $x \in I_1$. From (130), we also have $F(x) \in I_1$. In fact, our choice of
$\varepsilon$ guarantees that $F(x) \in (a - \varepsilon', a + \varepsilon')$, where recall that
$\varepsilon'$ is as in (129).

For $z \in (a - \varepsilon', a + \varepsilon') \cap I_1$, we have $G'(z) = 1/(z - F(z))$ and thus
$G''(z) = (F'(z) - 1)/(z - F(z))^2$. From (129), we thus obtain that
$|G''(z)| \le 16/(c|z - a|^3)$. Applying this for $z = F(x)$ and observing that
$F(x) - a \le -Ln^{-1/3}$, we obtain

\[ |G''(F(x))| \le \frac{16\,n}{cL^3}, \]

so that

\[ \max_{i \in \{1,2\}} |G''(F(x))\,Q_i/(2n)| \le \frac{8\max\{Q_1, Q_2\}}{cL^3}. \]

Let $\xi$ be as in (131). Since $\xi \in (F(x), x)$, we have from (118) that
$G'(\xi) = 1/(\xi - F(\xi))$. From (132), we have
$1/(4cK^2 n^{-2/3}) \le G'(\xi) \le 4/(cK^2 n^{-2/3})$. It follows from (130) and (131) that

\[ -8 \le G(F(x)) - G(x) \le -1/8. \]

Combining the above estimates, we can conclude that

\[ \max_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \le -\frac{1}{8} + \frac{8\max\{Q_1, Q_2\}}{cL^3}, \]
\[ \min_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \ge -8 - \frac{8\max\{Q_1, Q_2\}}{cL^3}. \tag{134} \]

Since we can choose $L$ to be an arbitrarily large constant, we can make the right-side quantities in
(134) negative constants, as needed. This completes the proof for Case I.
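
As a sanity check on the range $[-8, -1/8]$ obtained in Case I, one can look at the idealized model in which the Taylor remainder of $F$ is dropped. This model is an assumption made only for illustration; the actual $F$ carries an $O((z-a)^3)$ correction. Writing $u = x - a$ and assuming $x$ and $F(x)$ lie on the same side of $a$, the one-step change of the potential can be computed in closed form:

```latex
% Idealized model (assumption): F(z) = z - c (z - a)^2 exactly, with c > 0.
% On I_1 \cup I_4 the potential satisfies G'(z) = 1/(z - F(z)) = 1/(c (z-a)^2),
% so on either side of a we have G(z) = -\frac{1}{c (z - a)} + \text{const}.
\begin{align*}
F(x) - a &= u - c u^2 = u\,(1 - c u), \\
G(F(x)) - G(x)
  &= -\frac{1}{c\,u\,(1 - c u)} + \frac{1}{c\,u}
   = \frac{1}{c\,u}\Big(1 - \frac{1}{1 - c u}\Big)
   = -\frac{1}{1 - c u}.
\end{align*}
% For |c u| <= c \varepsilon < 1/32 (which follows from 4 c \varepsilon < 1/8),
% the value lies in (-32/31, -32/33), well inside [-8, -1/8].
```

In this model the potential drops by essentially one unit per step in the region governed by $G'(z) = 1/(z - F(z))$, which is the behaviour the choice of $G'$ on $I_1 \cup I_4$ is designed to produce.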

Case II. $-L \le K \le -L'$. In this case, we have $x \in I_2$. It follows from (130) that

\[ -\tfrac{1}{2} c(L')^2 n^{-2/3} \ge F(x) - x \ge -2cL^2 n^{-2/3}. \tag{135} \]

Since $x \in I_2$, from (135), for all sufficiently large $n$, we clearly have that either
$F(x) \in I_2$ or $F(x) \in I_1$.

Suppose first that $F(x) \in I_2$. From (118) we have that $|G''(F(x))| \le (10^2 C_1/L)\,n$, so that

\[ \max_{i \in \{1,2\}} |G''(F(x))\,Q_i/n| \le 10^2 C_1 Q_1/L. \]

Since $F(x) \in I_2$ and $\xi \in (F(x), x)$, we have $\xi \in I_2$ as well, so from (118) we have
$G'(\xi) \ge C_1 n^{2/3}$. We also have from (133) that $G'(\xi) \le C'_0 n^{2/3}$. Thus, together with
(131) and (135), we obtain

\[ -2cL^2 C'_0 \le G(F(x)) - G(x) \le -\tfrac{1}{2} c(L')^2 C_1. \tag{136} \]

It follows that

\[ \max_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \le -C_1\Big( \tfrac{1}{2} c(L')^2 - 10^2 Q_1/L \Big), \]
\[ \min_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \ge -\big[ 2cC'_0 L^2 + 10^2 C_1 Q_1/L \big]. \tag{137} \]

Suppose next that $F(x) \in I_1$ so that $a - Ln^{-1/3} \ge F(x)$. From the lower bound in (135), we
obtain $F(x) \ge a - Ln^{-1/3} - 2cL^2 n^{-2/3}$. Using that $\sup_x |G'''(x)| = O(n^{4/3})$ from (115),
it thus follows that $\big| |G''(F(x))| - |G''(a - Ln^{-1/3})| \big| = O(n^{2/3})$. Moreover, using that
$\max_x |G''(x)| = O(n)$ from (115) and $\xi \ge F(x)$, we see that
$\big| |G'(\xi)| - |G'(a - Ln^{-1/3})| \big| = O(n^{1/3})$. Combining these estimates yields again (137)
(modulo a term $o(1)$ which can be ignored for large $n$).

Since we can choose $L$ to be an arbitrarily large constant, we can make the right-side quantities
in (137) negative constants, as needed. This completes the proof for Case II.

Case III. $-L' \le K \le L$. In this case, we have $x \in I_3$. Observe that the lower bound in (135)
holds in this case as well, so for all sufficiently large $n$, we have that either $F(x) \in I_2$ or
$F(x) \in I_3$.

If $F(x) \in I_3$ then from (118) and (133), we have $-C'_0 n \le G''(F(x)) \le -C_2 n$. Since $G$ is
increasing and $F(x) \le x$, we trivially have $G(F(x)) - G(x) \le 0$. The lower bound on
$G(F(x)) - G(x)$ from (136) is valid in this case as well (since both (133) and the lower bound in
(135) hold), so we obtain

\[ -2cL^2 C'_0 \le G(F(x)) - G(x) \le 0. \]

It follows that

\[ \max_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \le -C_2 Q_2/2, \]
\[ \min_{i \in \{1,2\}} G(F(x)) - G(x) + G''(F(x))\,Q_i/(2n) \ge -2cL^2 C'_0 - C'_0 Q_1. \tag{138} \]

If $F(x) \in I_2$ then $a - L'n^{-1/3} \ge F(x)$. From (135) we obtain
$F(x) \ge a - L'n^{-1/3} - 2cL^2 n^{-2/3}$. It follows that
$|G(F(x)) - G(a - L'n^{-1/3})| = o(1)$ and
$\big| |G''(F(x))| - |G''(a - L'n^{-1/3})| \big| = O(n^{2/3})$, yielding again (138) (modulo a term
$o(1)$ which can be ignored for large $n$).

The right-side quantities in (138) are negative constants, as needed. This completes the proof
for Case III.

We have shown that (116) holds for all $x \in (1/B, 1]$, thus finishing the proof of Lemma 54. □

We conclude by giving the proof of Lemma 53. 

Proof of Lemma 53. Let $L, L'$ be positive constants satisfying $L \gg L'$. To keep better track of
the various subintervals involved in the construction of the function $G$, define $L_-, L_+, L_m$ by
setting $L_- = L_+ = L$ and $L_m = L'$ and note that $L_+, L_- \gg L_m$. Further, set

\[ z_- := a - L_- n^{-1/3}, \quad z_m := a - L_m n^{-1/3}, \quad z_+ := a + L_+ n^{-1/3}. \]

We will define piecewise the function $G(z)$ in the intervals

\[ I_0 = [1/q, 1/B], \quad I_1 = [1/B, z_-], \quad I_2 = [z_-, z_m], \quad I_3 = [z_m, z_+], \quad I_4 = [z_+, 1]. \]

Specifically, for $j \in \{0, 1, 2, 3, 4\}$, let $G_j(z)$ be a strictly increasing three-times
differentiable function defined on the interval $I_j$ which satisfies

\[ \sup_{z \in I_j} |G_j(z)| = O(n^{1/3}), \quad \sup_{z \in I_j} |G'_j(z)| = O(n^{2/3}), \quad \sup_{z \in I_j} |G''_j(z)| = O(n), \quad \sup_{z \in I_j} |G'''_j(z)| = O(n^{4/3}). \tag{139} \]

For $z \in I_j$, we will set $G(z) = G_j(z) + w_j$, where the $w_j$'s are such that $G(1/q) = 0$ and $G$
is well-defined on the interval (for example, $w_0 = -G_0(1/q)$,
$w_1 = -G_1(1/B) + G_0(1/B) - G_0(1/q)$ and so on). Note, from (139), the $w_j$'s satisfy
$|w_j| = O(n^{1/3})$.
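
The offsets $w_j$ are determined recursively by requiring continuity at each breakpoint. A minimal Python sketch of this gluing step follows; the helper `glue` and the toy pieces in the usage below are illustrative only and are not the functions $G_j$ of the proof.

```python
def glue(pieces, breakpoints):
    """Glue pieces[j], defined on [breakpoints[j], breakpoints[j+1]], into one
    continuous function G with G(breakpoints[0]) = 0.

    Returns (G, offsets) where G(z) = pieces[j](z) + offsets[j] on the j-th interval.
    """
    # w_0 = -G_0(left endpoint), so that G vanishes at the left endpoint.
    offsets = [-pieces[0](breakpoints[0])]
    for j in range(1, len(pieces)):
        b = breakpoints[j]
        # Continuity at b: pieces[j-1](b) + w_{j-1} = pieces[j](b) + w_j.
        offsets.append(offsets[j - 1] + pieces[j - 1](b) - pieces[j](b))

    def G(z):
        if not breakpoints[0] <= z <= breakpoints[-1]:
            raise ValueError("z is outside the domain")
        for j in range(len(pieces)):
            if z <= breakpoints[j + 1]:
                return pieces[j](z) + offsets[j]

    return G, offsets

# Toy usage with three increasing pieces on [0, 1], [1, 2], [2, 3].
toy_pieces = [lambda z: 2 * z + 5, lambda z: z * z, lambda z: 3 * z - 1]
toy_G, toy_offsets = glue(toy_pieces, [0, 1, 2, 3])
```

With these toy pieces the second offset equals $-G_1(b) + G_0(b) - G_0(0)$ at the breakpoint $b = 1$, matching the pattern of the formula for $w_1$ above. Matching derivatives across breakpoints is a separate requirement handled by the choice of the pieces themselves.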

The construction of $G$ ensures that $G(1/q) = 0$ and that $G$ is continuous and strictly increasing
in the interval $[1/q, 1]$. From (139) and the fact that the $w_j$'s satisfy $|w_j| = O(n^{1/3})$, we
also obtain that there exists a constant $M > 0$ such that $G(z) \le Mn^{1/3}$ for all
$z \in [1/q, 1]$.

The main part of the argument is to specify strictly increasing functions $G_j$ so that:

i. The properties (118) and (139) hold.

ii. $G$ is three-times differentiable.

Provided that these conditions are met, we obtain that the function $G$ also satisfies (115) (which
completes the proof of the lemma). The roadmap of the construction is as follows:

1. We first specify the functions $G_1, G_4$. In particular, we will have

\[ G'_1(z) = 1/(z - F(z)) \text{ for } z \in I_1, \qquad G'_4(z) = 1/(z - F(z)) \text{ for } z \in I_4. \tag{140} \]

$G_1, G_4$ are strictly increasing three-times differentiable functions which also satisfy (139).

2. The derivatives of $G_0$ at $z = 1/B$ need to match the derivatives of $G_1$ at $z = 1/B$, i.e.,

\[ G'_0(1/B) = G'_1(1/B), \quad G''_0(1/B) = G''_1(1/B), \quad G'''_0(1/B) = G'''_1(1/B). \tag{141} \]

We will see that $G'_1(1/B), G''_1(1/B), G'''_1(1/B)$ are constants that do not depend on $n$. Thus,
$G_0$ can be chosen to be a function that does not depend on $n$ whatsoever; any strictly increasing
three-times differentiable function which satisfies (141) will do. This yields that $G_0$ in fact
satisfies the following bounds (which are stronger than those given in (139)):

\[ \max_{z \in I_0} |G_0(z)| = O(1), \quad \max_{z \in I_0} |G'_0(z)| = O(1), \quad \max_{z \in I_0} |G''_0(z)| = O(1), \quad \max_{z \in I_0} |G'''_0(z)| = O(1). \tag{142} \]

3. The function $G_3(z)$ will be chosen to be quadratic. The requirement (144) will thus completely
specify $G_3$ (up to an additive constant). We will see that $G''_4(z_+)$ is negative, so the function
$G_3$ will be concave. Our goal here is to ensure that for constants $C_2, C_3 > 0$ it holds that

\[ G'_3(z) \ge C_3 n^{2/3}, \quad G''_3(z) \le -C_2 n \quad \text{for } z \in I_3, \tag{143} \]
\[ G'_3(z_+) = G'_4(z_+), \quad G''_3(z_+) = G''_4(z_+). \tag{144} \]

Note that $G_3$ is clearly strictly increasing (from (143)) and three-times differentiable (since
$G_3$ is quadratic). $G_3$ will also satisfy (139).

4. The function $G_2$ will satisfy the following constraints (in addition to (146)):

\[ G'_2(z) \ge C_1 n^{2/3}, \quad |G''_2(z)| \le (10^2 C_1/L)\,n \quad \text{for } z \in I_2, \tag{145} \]
\[ G'_2(z_-) = G'_1(z_-), \quad G''_2(z_-) = G''_1(z_-), \tag{146} \]
\[ G'_2(z_m) = G'_3(z_m), \quad G''_2(z_m) = G''_3(z_m), \tag{147} \]

where $C_1$ is a positive constant. Note that $G_2$ is clearly strictly increasing (from (145)). $G_2$
will also be three-times differentiable and it will satisfy (139).

We will see that $G'_1(z_-) < G'_3(z_m)$ and $G''_1(z_-) > 0 > G''_3(z_m)$. Recall that we also need
that the first derivative of $G_2$ is positive. Thus, the first derivative $G'_2$ will increase overall
in the interval $I_2$, yet at the same time $G'_2$ should change monotonicity at some point inside the
interval.
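
To illustrate Item 3, the following LaTeX sketch writes out the quadratic $G_3$ forced by the matching conditions (144), using the shorthand $d'_+ := G'_4(z_+)$ and $d''_+ := G''_4(z_+)$. The explicit values are computed under the simplifying assumption $F(z) = z - c(z-a)^2$, i.e., with the Taylor remainder dropped; they are leading-order illustrations and not the exact constants of the proof.

```latex
% Quadratic determined by (144), up to an additive constant:
\begin{align*}
G_3(z)   &= d'_+\,(z - z_+) + \tfrac{1}{2}\, d''_+\,(z - z_+)^2, \\
G_3'(z)  &= d'_+ + d''_+\,(z - z_+), \qquad G_3''(z) = d''_+ .
\end{align*}
% Under the assumption F(z) = z - c (z-a)^2 and z_+ - a = L n^{-1/3}:
\begin{align*}
d'_+  &= \frac{1}{c\,(z_+ - a)^2} = \frac{n^{2/3}}{c L^2}, &
d''_+ &= -\frac{2}{c\,(z_+ - a)^3} = -\frac{2n}{c L^3}.
\end{align*}
% For z in I_3 we have z - z_+ \le 0 and d''_+ < 0, hence
%   G_3'(z) \ge d'_+ = n^{2/3}/(c L^2) > 0   and   G_3''(z) = -2n/(c L^3),
% so in this model (143) holds with C_3 = 1/(c L^2) and C_2 = 2/(c L^3).
```

Since $d''_+ < 0$, the derivative $G'_3$ is smallest at the right endpoint $z_+$, which is why matching at $z_+$ already gives positivity on all of $I_3$.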


Let us assume for now that the functions $G_j$ satisfy all of the Items 1-4 and conclude that the
function $G$ satisfies Conditions i and ii. For Condition i, first observe that (139) is satisfied for
all $j \in \{0, 1, 2, 3, 4\}$ by Items 1-4. Also, equations (140), (142), (143) and (145) show that $G$
satisfies (118). This proves that $G$ satisfies Condition i. Relative to Condition ii, using (141),
(144), (146), (147) and the three-times differentiability of the $G_j$'s, we have that $G$ is two-times
continuously differentiable with a third derivative which exists everywhere apart (possibly) from
the points $z = z_-, z_m, z_+$. For each of these points, we interpolate $G'''$ in an (infinitesimally)
small neighborhood of the point using a steep linear function; the use of the linear function
guarantees that the order of $G'''$ is still $O(n^{4/3})$. The infinitesimally small length of the
interpolation interval guarantees that the effect on $G, G', G''$ by this modification of $G'''$ can
safely be ignored. It follows that $G$ satisfies Conditions i and ii, as wanted.

It remains to obtain Items 1 — 4. We start with Item 1. 

To specify the functions $G_1$ and $G_4$, first consider a function $h$ on $I_1 \cup I_4$ which
satisfies $h(1/B) = h(z_+) = 0$ and $h'(z) = 1/(z - F(z))$ for $z \in I_1 \cup I_4$. This well-defines
$h$ on $I_1 \cup I_4$. We then set $G_1(z) = h(z)$ for $z \in I_1$ and $G_4(z) = h(z)$ for $z \in I_4$.
For $z \in I_1 \cup I_4$, note that $z > F(z)$ (using that $z \ne a$ and Lemma 52) and thus
$h'(z) > 0$, so $G_1$ and $G_4$ are strictly increasing. It remains to show (139) for $j = 1, 4$.

Note that

\[ h''(z) = \frac{F'(z) - 1}{(z - F(z))^2}, \qquad h'''(z) = \frac{2(F'(z) - 1)^2 + F''(z)(z - F(z))}{(z - F(z))^3}. \tag{148} \]
Let $c := -F''(a)/2$. By Lemma 52, we have that $c > 0$. By Taylor's theorem, we have that for all 
sufficiently small $\varepsilon > 0$, for all $z$ in the interval $I := (a - \varepsilon, a + \varepsilon)$, it holds that 

$$F(z) = z - c(z - a)^2 + R_3(z) \tag{149}$$

for a remainder function $R_3(z)$ which satisfies $\max_{z \in I} |R_3(z)| = O(|z - a|^3)$. From (149), it also 
follows that 

$$F'(z) = 1 - 2c(z - a) + R_2(z), \qquad F''(z) = -2c + R_1(z),$$

for remainder functions $R_1(z), R_2(z)$ which satisfy $\max_{z \in I} |R_1(z)| = O(|z - a|)$ and $\max_{z \in I} |R_2(z)| = O(|z - a|^2)$. 
We thus obtain that there exist constants $U_1, U_2, U_3 > 0$ such that for $z \in I \setminus \{a\}$, it 
holds that 


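$$\begin{aligned}
\left| \frac{1}{z - F(z)} - \frac{1}{c(z - a)^2} \right| &\le U_1 |z - a|^{-1}, \\
\left| \frac{F'(z) - 1}{(z - F(z))^2} + \frac{2}{c(z - a)^3} \right| &\le U_2 |z - a|^{-2}, \\
\left| \frac{2(F'(z) - 1)^2 + F''(z)\,(z - F(z))}{(z - F(z))^3} - \frac{6}{c(z - a)^4} \right| &\le U_3 |z - a|^{-3}.
\end{aligned} \tag{150}$$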
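As a numerical sanity check of the leading-order behaviour behind (150), the following Python sketch evaluates the three left-hand sides, rescaled by $|z-a|$, $|z-a|^2$ and $|z-a|^3$, for a toy map. The cubic $F(z) = z - c(z-a)^2 + d(z-a)^3$ and the values $a = 0.5$, $c = 1$, $d = 0.3$ are illustrative assumptions and not the function $F$ of the paper; the toy map only shares the properties $F(a) = a$, $F'(a) = 1$, $F''(a) = -2c$ used in the Taylor expansion.

```python
# Toy check of the estimates in (150). The cubic F below is a hypothetical
# stand-in with F(a) = a, F'(a) = 1, F''(a) = -2c; it is not the paper's F.
A, C, D = 0.5, 1.0, 0.3  # illustrative values of a, c and the cubic coefficient


def F(z):
    x = z - A
    return z - C * x**2 + D * x**3


def dF(z):
    x = z - A
    return 1.0 - 2.0 * C * x + 3.0 * D * x**2


def d2F(z):
    x = z - A
    return -2.0 * C + 6.0 * D * x


def rescaled_errors(z):
    """Left-hand sides of (150) multiplied by |z-a|, |z-a|^2, |z-a|^3."""
    x = z - A
    gap = z - F(z)  # positive for z != a close to a
    e1 = abs(1.0 / gap - 1.0 / (C * x**2)) * abs(x)
    e2 = abs((dF(z) - 1.0) / gap**2 + 2.0 / (C * x**3)) * abs(x) ** 2
    e3 = abs((2.0 * (dF(z) - 1.0) ** 2 + d2F(z) * gap) / gap**3
             - 6.0 / (C * x**4)) * abs(x) ** 3
    return e1, e2, e3


if __name__ == "__main__":
    for x in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
        print(x, rescaled_errors(A + x))
```

If the estimates hold, the three printed numbers stay bounded as $z \to a$; for the toy map these bounded values play the role of the constants $U_1, U_2, U_3$.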


Using (148) and (150), it is immediate to show that $|h'(z)| = O(n^{2/3})$, $\max_{z \in I_1 \cup I_4} |h''(z)| = O(n)$, 
$\max_{z \in I_1 \cup I_4} |h'''(z)| = O(n^{4/3})$, and thus these bounds carry over to $G_1, G_4$ as well. We next 
show that $\max_{z \in I_1} h(z) = O(n^{1/3})$, the proof for $\max_{z \in I_4} h(z) = O(n^{1/3})$ being completely analogous. 

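In the interval $z \in [1/B, a - \varepsilon]$, we have that $h'$ is bounded above by an absolute constant 
throughout the interval, so we clearly have that $h(a - \varepsilon) - h(1/B) = O(1)$. Consider next $z \in 
(a - \varepsilon, z_-)$, and parameterize $z$ as $z = a - K n^{-1/3}$ for some $K$ which satisfies $L_- < K < \varepsilon n^{1/3}$. 
Using (150), we have the bound 

$$h'(z) \le \frac{1 + \varepsilon U_1}{cK^2}\, n^{2/3}.$$

Thus 

$$h(z_-) - h(a - \varepsilon) = \int_{L_-}^{\varepsilon n^{1/3}} n^{-1/3}\, h'\big(a - K n^{-1/3}\big)\, dK \le \frac{(1 + \varepsilon U_1)\, n^{1/3}}{c} \int_{L_-}^{\varepsilon n^{1/3}} \frac{1}{K^2}\, dK \le M n^{1/3},$$

for some absolute constant $M$. This concludes the construction for Item 1. 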
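The $n^{1/3}$ scaling of this increment can be illustrated numerically. The sketch below replaces $h'(z)$ by its leading-order model $1/(c(z-a)^2)$ (that is, it drops the remainder controlled by $U_1$), and the values of $n$, $L_-$, $\varepsilon$, $c$, $U_1$ used in the usage note are illustrative assumptions, not values from the paper.

```python
# Illustration of the bound h(z_-) - h(a - eps) <= M n^(1/3).
# Assumption: h'(z) is replaced by its leading-order model 1 / (c (z - a)^2).


def toy_increment(n, L_minus, eps, c, steps=20000):
    """Midpoint-rule value of the integral of 1/(c x^2) over
    x = |z - a| in [L_minus * n^(-1/3), eps]."""
    lo, hi = L_minus * n ** (-1.0 / 3.0), eps
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += width / (c * x * x)
    return total


def increment_bound(n, L_minus, eps, c, U1):
    """(1 + eps U1) n^(1/3) / c times the integral of K^(-2) over
    [L_minus, eps n^(1/3)], evaluated in closed form."""
    upper = eps * n ** (1.0 / 3.0)
    return ((1.0 + eps * U1) * n ** (1.0 / 3.0) / c
            * (1.0 / L_minus - 1.0 / upper))


def absolute_constant(L_minus, eps, c, U1):
    """One admissible M: drop the negative term of the closed-form integral."""
    return (1.0 + eps * U1) / (c * L_minus)
```

For instance, with the hypothetical values `L_minus=2.0`, `eps=0.1`, `c=1.0`, `U1=1.0`, one can compare `toy_increment(n, 2.0, 0.1, 1.0)` against `absolute_constant(2.0, 0.1, 1.0, 1.0) * n ** (1/3)` for increasing `n`; the ratio stays below one.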

For Item 2, we only need to show that $G_1'(1/B), G_1''(1/B), G_1'''(1/B)$ are constants. This is clear 
for $G_1'(1/B)$, and for $G_1''(1/B), G_1'''(1/B)$ it follows from the expressions in (148) (note, using the 
method in Lemma 9, one can show that the right derivative of $F$ at $1/B$ is equal to $2(q - 1)/q$, 
while the right second derivative of F at \/B is equal to — ^^^ 3 ^^)- This yields Item 2. 

For Items 3 and 4, we will need the values of the derivatives of $G_1$ and $G_4$ at the points $z_-$ and 
$z_+$, respectively. Set 

$$d'_- := G_1'(z_-), \qquad d''_- := G_1''(z_-), \qquad d'_+ := G_4'(z_+), \qquad d''_+ := G_4''(z_+).$$

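From the first two inequalities in (150), we obtain 

$$\lim_{n \to \infty} \frac{d'_-}{n^{2/3}} = \frac{1}{cL_-^2}, \qquad \lim_{n \to \infty} \frac{d''_-}{n} = \frac{2}{cL_-^3}, \qquad \lim_{n \to \infty} \frac{d'_+}{n^{2/3}} = \frac{1}{cL_+^2}, \qquad \lim_{n \to \infty} \frac{d''_+}{n} = -\frac{2}{cL_+^3}. \tag{151}$$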
From (151), we obtain that for all sufficiently large $n$, there exist $D'_\pm, D''_\pm > 0$ such that 

$$d'_\pm = D'_\pm\, n^{2/3}, \qquad d''_- = D''_-\, n, \qquad d''_+ = -D''_+\, n,$$

and 

$$\frac{1}{cL_\pm^2}\,(1 - 10^{-5}) \le D'_\pm \le (1 + 10^{-5})\,\frac{1}{cL_\pm^2}, \qquad \frac{2}{cL_\pm^3}\,(1 - 10^{-5}) \le D''_\pm \le (1 + 10^{-5})\,\frac{2}{cL_\pm^3}. \tag{152}$$

Note that $D'_\pm, D''_\pm$ depend on $n$, but as (152) shows they satisfy $D'_\pm, D''_\pm = O(1)$. 

We are now ready to show Item 3. For $z \in I_3$, we will set $G_3(z) = u_1 n^{2/3}(z - a) + u_2 n (z - a)^2$ 
for $u_1, u_2$ which we next specify. To satisfy (144), we will choose 

$$2u_2 = -D''_+, \qquad u_1 + 2u_2 L_+ = D'_+. \tag{153}$$

Observe that $u_2 < 0$, so $G_3$ is not only a quadratic function but also concave. Note that $u_1, u_2$ 
satisfy $|u_1|, |u_2| = O(1)$, from which it easily follows that (139) is satisfied (for $j = 3$). For (143), 
just observe that $u_2$, as defined in (153), is a negative constant and hence the bound on $G_3''$ is 
immediate. This completes the construction for Item 3. 

For the construction in Item 4, we will need a handle on the derivatives of $G_3$ at the endpoint 
$z_m$ of the interval $I_3$. Let 

$$D'_m := G_3'(z_m)/n^{2/3} \quad \text{and} \quad D''_m := -G_3''(z_m)/n.$$


We will show that 

$$D'_m = (1 \pm 10^{-4})\,\frac{3}{cL_+^2}, \qquad D''_m = (1 \pm 10^{-4})\,\frac{2}{cL_+^3}. \tag{154}$$


By the definition of $D'_m$, we have that $D'_m = u_1 - 2u_2 L_m$ and hence, by the choice (153) of $u_1, u_2$, 
we have 

$$D'_m = D'_+ + D''_+ (L_+ + L_m).$$

Also, we have $D''_m = D''_+$ since the function $G_3$ is quadratic. It is thus immediate to conclude (154) 
using the bounds in (152) and $L_\pm \gg L_m$. 
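The matching conditions for the quadratic $G_3$ and the resulting identity for $D'_m$ can be checked with exact rational arithmetic. The sketch below works in the rescaled variable $x = n^{1/3}(z - a)$, so that $G_3'(z)/n^{2/3} = u_1 + 2u_2 x$ and $G_3''(z)/n = 2u_2$. It assumes $z_+ = a + L_+ n^{-1/3}$ and $z_m = a - L_m n^{-1/3}$ (inferred from the parameterization $z = a - K n^{-1/3}$ used for Item 1), and all numerical values are illustrative only.

```python
from fractions import Fraction


def quadratic_coefficients(D1_plus, D2_plus, L_plus):
    """Solve the two conditions 2 u2 = -D''_+ and u1 + 2 u2 L_+ = D'_+."""
    u2 = -D2_plus / 2
    u1 = D1_plus - 2 * u2 * L_plus
    return u1, u2


def rescaled_derivatives(u1, u2, x):
    """Return (G_3'/n^(2/3), G_3''/n) at the rescaled point x = n^(1/3) (z - a)."""
    return u1 + 2 * u2 * x, 2 * u2


# Illustrative values (assumptions): c = 1, L_+ = 6, L_m = 1/100, with
# D'_+ and D''_+ set to their limiting values 1/(c L_+^2) and 2/(c L_+^3).
L_PLUS, L_M = Fraction(6), Fraction(1, 100)
D1_PLUS, D2_PLUS = 1 / L_PLUS**2, 2 / L_PLUS**3
u1, u2 = quadratic_coefficients(D1_PLUS, D2_PLUS, L_PLUS)

# D'_m and D''_m at the left endpoint z_m; note the sign flip in D''_m.
first_at_zm, second_at_zm = rescaled_derivatives(u1, u2, -L_M)
D1_M, D2_M = first_at_zm, -second_at_zm
```

With these values, `D1_M` agrees exactly with `D1_PLUS + D2_PLUS * (L_PLUS + L_M)` and `D2_M` with `D2_PLUS`, which is the content of the two identities above.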

We are now ready to give the construction for Item 4. To define the function $G_2(z)$ on the 
interval $I_2$, we will set 

$$G_2(z) = n^{1/3}\, g\big(n^{1/3}(z - a)\big),$$

where $g$ is a three times differentiable function on the interval $I := [-L_-, -L_m]$ such that 


g'{-L.) = D'_, 
g {—Lm) = 


min5r^(x) > 

X&I 


1 


g"{-L.) = D'L, 
g”{-Lm) = -Dl, 

max I/(x) I < 
x&l cL'± 


(155) 

(156) 

(157) 


Equations (155), (156) and (157) ensure that the function $G_2$ satisfies (145), (146) and (147). Also, 
it will be clear from the specification of $g$ that all of $g, g', g'', g'''$ are bounded by absolute constants, 
which thus implies that $G_2$ satisfies (139) (for $j = 2$). 

It remains to specify such a function $g$; we do this by specifying its second derivative. More 
precisely, for $z \in I$, we will set 

$$g'(z) := D'_- + \int_{-L_-}^{z} h(x)\,dx, \quad \text{so that } g''(z) = h(z), \tag{158}$$
where $h(z)$ is a differentiable function on $I$ satisfying 

$$h(-L_-) = D''_-, \qquad h(-L_m) = -D''_m, \qquad \int_{-L_-}^{-L_m} h(x)\,dx = D'_m - D'_-, \tag{159}$$

$$\max_{x \in I} |h(x)| \le \frac{25}{cL_\pm^3}, \qquad \int_{-L_-}^{z} h(x)\,dx \ge 0 \ \text{ for all } z \in I. \tag{160}$$


Using (159) and (160), it is immediate to verify that the function $g$, as specified in (158), satisfies 
(155), (156) and (157) (for the first inequality in (157), note that $g'(z) \ge D'_-$ for all $z \in I$ and then 
use the bound for $D'_-$ from (152)). 

To specify the function $h$, we will need two parameters $K_1, K_2 > 0$ such that 

$$-L_- < -K_1 < -K_2 < -L_m. \tag{161}$$


We will specify the parameters $K_1, K_2$ shortly, but for now it will be more instructive to assume 
that $K_1, K_2$ just satisfy (161); the freedom to specify $K_1, K_2$ will be helpful at a slightly later point. 
So, consider the function $h$ defined on $[-L_-, -L_m]$ by 


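$$h(z) = \begin{cases}
D''_-\, \dfrac{(z + K_1)^2}{(L_- - K_1)^2} & \text{if } -L_- \le z \le -K_1, \\[2ex]
\dfrac{100\,(D'_m - D'_-)}{3\,(K_1 - K_2)^5}\, (z + K_1)^2 (z + K_2)^2 & \text{if } -K_1 < z < -K_2, \\[2ex]
-D''_m\, \dfrac{(z + K_2)^2}{(K_2 - L_m)^2} & \text{if } -K_2 \le z \le -L_m.
\end{cases}$$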


Note that 

$$h(-L_m) = -D''_m, \qquad h(-L_-) = D''_-, \tag{162}$$

and that $h$ is differentiable throughout the interval $[-L_-, -L_m]$ since at the points $z = -K_1, -K_2$ 
it holds that $h(-K_1) = h(-K_2) = h'(-K_1) = h'(-K_2) = 0$. Further, by a direct calculation, the 
function $h$ satisfies the following: 

$$\int_{-L_-}^{-K_1} h(z)\,dz = \frac{D''_-}{3}\,(L_- - K_1), \qquad \int_{-K_1}^{-K_2} h(z)\,dz = \frac{10}{9}\,(D'_m - D'_-), \qquad \int_{-K_2}^{-L_m} h(z)\,dz = -\frac{D''_m}{3}\,(K_2 - L_m). \tag{163}$$

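Since $D'_m > D'_- > 0$ and $D''_-, D''_m > 0$ (cf. (152) and (154)), we also have that 

$$\begin{aligned}
0 \le h(z) \le D''_- \quad &\text{for } z \in [-L_-, -K_1], \\
0 \le h(z) \le \frac{100\,(D'_m - D'_-)}{48\,(K_1 - K_2)} \quad &\text{for } z \in [-K_1, -K_2], \\
-D''_m \le h(z) \le 0 \quad &\text{for } z \in [-K_2, -L_m].
\end{aligned} \tag{164}$$

It follows that 

$$\max_{z \in I} |h(z)| \le M, \quad \text{where } M := \max\left\{ D''_-,\ \frac{100\,(D'_m - D'_-)}{48\,(K_1 - K_2)},\ D''_m \right\}. \tag{165}$$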

It remains to choose $K_1, K_2$ satisfying (161) so that the specifications for $h$ in (159) and (160) 
are satisfied. We set 

= = + ( 166 ) 
Using (152), (154) and $L_\pm \gg L_m$, we have 

$$K_1 = \left(\tfrac{5}{6} \pm 10^{-3}\right) L_-, \qquad K_2 = \left(\tfrac{1}{2} \pm 10^{-3}\right) L_+. \tag{167}$$

Since $L_+ = L_- \gg L_m$, we obtain that $K_1, K_2$ satisfy (161) as desired. 

We next check that the specifications for $h$ in (159) and (160) are satisfied. First, combining 
(163) and (166), we obtain that 

$$\int_{-L_-}^{-L_m} h(z)\,dz = D'_m - D'_-. \tag{168}$$

Equations (162) and (168) show that $h$ does indeed satisfy (159). 

We next show that $h$ satisfies the inequalities in (160). To show that $\max_{z \in I} |h(z)| \le 25/(cL_\pm^3)$, 
it suffices to show that $M \le 25/(cL_\pm^3)$, where $M$ is as in (165). This is immediate to verify using 
(152), (154) and (167). For the second inequality in (160), note from (164) that the function 
$h(z)$ is non-negative when $z \le -K_2$ and negative when $z > -K_2$. The integral $\int_{-L_-}^{z} h(x)\,dx$ is 
therefore non-decreasing in $z$ on $[-L_-, -K_2]$ and non-increasing on $[-K_2, -L_m]$, so it suffices to check the 
inequality in (160) when $z = -L_-$ and $z = -L_m$. For $z = -L_-$, the inequality holds (trivially) at 
equality, while for $z = -L_m$ the inequality follows from (168) and $D'_m > D'_-$. This completes the 
construction for Item 4. 
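The integral identities used for $h$ can be verified exactly on a small example. In the sketch below, the shape of $h$ (quadratic end pieces and a quartic bump with coefficient $100(D'_m - D'_-)/(3(K_1 - K_2)^5)$) is an assumption inferred from (162)-(165), the numerical values ($c = 1$, $L_- = L_+ = 6$, $L_m = 1/100$) are illustrative, and the choice of $K_1, K_2$ is a hypothetical one that makes the three integrals sum to $D'_m - D'_-$; it is not claimed to coincide with (166). Boole's rule is used because it integrates polynomials of degree at most five exactly, so with rational inputs every comparison is exact.

```python
from fractions import Fraction


def boole(f, lo, hi):
    """Boole's rule on [lo, hi]; exact for polynomials of degree <= 5."""
    s = (hi - lo) / 4
    return (hi - lo) * (7 * f(lo) + 32 * f(lo + s) + 12 * f(lo + 2 * s)
                        + 32 * f(lo + 3 * s) + 7 * f(hi)) / 90


def make_h(p):
    """Piecewise polynomial h on [-L_-, -L_m] (assumed shape, see lead-in)."""
    L_minus, L_m, K1, K2 = p["L_minus"], p["L_m"], p["K1"], p["K2"]
    gap = p["D1_m"] - p["D1_minus"]

    def h(z):
        if z <= -K1:  # left piece: decreases from D''_- to 0
            return p["D2_minus"] * (z + K1) ** 2 / (L_minus - K1) ** 2
        if z < -K2:   # middle bump: vanishes to first order at both ends
            return (100 * gap * (z + K1) ** 2 * (z + K2) ** 2
                    / (3 * (K1 - K2) ** 5))
        return -p["D2_m"] * (z + K2) ** 2 / (K2 - L_m) ** 2  # right piece

    return h


def example_parameters():
    """Illustrative values only: c = 1, L_- = L_+ = 6, L_m = 1/100."""
    L, L_m = Fraction(6), Fraction(1, 100)
    D1_minus, D2_minus = 1 / L**2, 2 / L**3
    D2_m = 2 / L**3
    D1_m = 1 / L**2 + D2_m * (L + L_m)
    gap = D1_m - D1_minus
    # Hypothetical K1, K2 balancing the three integrals (not the paper's (166)).
    K1 = L - gap / (6 * D2_minus)
    K2 = L_m + gap / (2 * D2_m)
    return dict(L_minus=L, L_m=L_m, K1=K1, K2=K2, D1_minus=D1_minus,
                D2_minus=D2_minus, D1_m=D1_m, D2_m=D2_m)


def piece_integrals(p):
    """Exact integrals of h over [-L_-,-K1], [-K1,-K2], [-K2,-L_m]."""
    h = make_h(p)
    cuts = [-p["L_minus"], -p["K1"], -p["K2"], -p["L_m"]]
    return [boole(h, cuts[i], cuts[i + 1]) for i in range(3)]
```

Calling `piece_integrals(example_parameters())` returns the three integrals as exact fractions; their sum can be compared directly with `D1_m - D1_minus`, and the partial sums give the running integral at the breakpoints.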

We have thus shown how to construct the functions $G_0, G_1, G_2, G_3, G_4$ so that 
Items 1-4 hold, completing the proof of Lemma 53. □ 